luocooong commented on a change in pull request #2419:
URL: https://github.com/apache/drill/pull/2419#discussion_r783195693
##########
File path:
contrib/format-httpd/src/main/java/org/apache/drill/exec/store/httpd/HttpdLogFormatPlugin.java
##########
@@ -40,18 +40,16 @@
private static class HttpLogReaderFactory extends FileReaderFactory {
private final HttpdLogFormatConfig config;
- private final int maxRecords;
private final EasySubScan scan;
- private HttpLogReaderFactory(HttpdLogFormatConfig config, int maxRecords, EasySubScan scan) {
+ private HttpLogReaderFactory(HttpdLogFormatConfig config, EasySubScan scan) {
this.config = config;
- this.maxRecords = maxRecords;
this.scan = scan;
}
@Override
- public ManagedReader<? extends FileScanFramework.FileSchemaNegotiator> newReader() {
- return new HttpdLogBatchReader(config, maxRecords, scan);
+ public ManagedReader newReader(FileSchemaNegotiator negotiator) throws EarlyEofException {
Review comment:
I have two questions about `EarlyEofException`:
1. What happens if the constructor being called never throws this exception?
2. How does the new framework handle this exception when it is thrown?
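To make the two questions concrete, here is a minimal, self-contained sketch. Every name in it (`EarlyEofSignal`, `LineSource`, `StrictSource`, `LenientSource`, `SourceFactory`, `EarlyEofDemo`) is hypothetical and stands in for the real classes; it does not use the Drill API. It only illustrates two things: in Java a `throws` clause is a permission, so a factory may declare a checked exception even when the constructor it calls never throws it, and one plausible way a caller could react is to skip the input. Whether the new framework actually behaves like the `scan` loop below is an assumption, not something this diff confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical checked "no data" signal; a stand-in, not Drill's EarlyEofException.
class EarlyEofSignal extends Exception {
  EarlyEofSignal(String message) {
    super(message);
  }
}

// Hypothetical reader abstraction used only for this sketch.
interface LineSource {
  String data();
}

// Constructor throws the signal when there is nothing to read.
class StrictSource implements LineSource {
  private final String data;

  StrictSource(String data) throws EarlyEofSignal {
    if (data.isEmpty()) {
      throw new EarlyEofSignal("empty input");
    }
    this.data = data;
  }

  @Override
  public String data() {
    return data;
  }
}

// Constructor never throws; it still fits a factory that declares the exception.
class LenientSource implements LineSource {
  private final String data;

  LenientSource(String data) {
    this.data = data;
  }

  @Override
  public String data() {
    return data;
  }
}

// Mirrors the shape of newReader(...) in the diff: the throws clause is a permission.
interface SourceFactory {
  LineSource newSource(String input) throws EarlyEofSignal;
}

class EarlyEofDemo {
  // Assumed caller behavior: an early EOF skips that input and the scan continues.
  static List<String> scan(SourceFactory factory, List<String> inputs) {
    List<String> out = new ArrayList<>();
    for (String input : inputs) {
      try {
        out.add(factory.newSource(input).data());
      } catch (EarlyEofSignal e) {
        // Nothing to read from this input; move on to the next one.
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> inputs = Arrays.asList("a", "", "b");
    System.out.println(scan(StrictSource::new, inputs));
    System.out.println(scan(LenientSource::new, inputs));
  }
}
```

In this sketch the strict source drops the empty input while the lenient one passes it through, and both compile against the same factory interface.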
##########
File path:
contrib/format-httpd/src/main/java/org/apache/drill/exec/store/httpd/HttpdLogBatchReader.java
##########
@@ -40,36 +41,29 @@
import java.io.InputStream;
import java.io.InputStreamReader;
-public class HttpdLogBatchReader implements ManagedReader<FileSchemaNegotiator> {
+public class HttpdLogBatchReader implements ManagedReader {
private static final Logger logger = LoggerFactory.getLogger(HttpdLogBatchReader.class);
public static final String RAW_LINE_COL_NAME = "_raw";
public static final String MATCHED_COL_NAME = "_matched";
private final HttpdLogFormatConfig formatConfig;
- private final int maxRecords;
- private final EasySubScan scan;
- private HttpdParser parser;
- private FileSplit split;
+ private final HttpdParser parser;
+ private final FileDescrip file;
private InputStream fsStream;
- private RowSetLoader rowWriter;
+ private final RowSetLoader rowWriter;
private BufferedReader reader;
private int lineNumber;
- private CustomErrorContext errorContext;
- private ScalarWriter rawLineWriter;
- private ScalarWriter matchedWriter;
+ private final CustomErrorContext errorContext;
+ private final ScalarWriter rawLineWriter;
+ private final ScalarWriter matchedWriter;
private int errorCount;
-
- public HttpdLogBatchReader(HttpdLogFormatConfig formatConfig, int maxRecords, EasySubScan scan) {
+ public HttpdLogBatchReader(HttpdLogFormatConfig formatConfig, EasySubScan scan, FileSchemaNegotiator negotiator) {
this.formatConfig = formatConfig;
- this.maxRecords = maxRecords;
- this.scan = scan;
- }
- @Override
- public boolean open(FileSchemaNegotiator negotiator) {
Review comment:
Merging open() into the constructor simplifies initialization and lets us
declare more fields as final. It also means we no longer have to decide
whether open() should return true or false. Are there any other advantages
to this approach?
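As a generic illustration of the point about final fields, here is a small sketch that contrasts the two initialization styles. The classes `TwoPhaseLineReader`, `ConstructedLineReader`, and `ConstructorInitDemo` are hypothetical and do not use the Drill API; they only show the pattern, not the actual reader in this PR.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical two-phase reader: state set in open() cannot be final,
// and next() has to guard against open() never having been called.
class TwoPhaseLineReader {
  private final String source;
  private BufferedReader reader; // null until open() runs

  TwoPhaseLineReader(String source) {
    this.source = source;
  }

  boolean open() {
    reader = new BufferedReader(new StringReader(source));
    return true; // the caller must remember to check this result
  }

  String next() {
    if (reader == null) {
      throw new IllegalStateException("open() was not called");
    }
    try {
      return reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

// Hypothetical single-phase reader: the constructor does all the setup,
// so the field is final and every instance is ready to use.
class ConstructedLineReader {
  private final BufferedReader reader;

  ConstructedLineReader(String source) {
    this.reader = new BufferedReader(new StringReader(source));
  }

  String next() {
    try {
      return reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

class ConstructorInitDemo {
  // Drains a ConstructedLineReader into a list of lines.
  static List<String> readAll(String source) {
    ConstructedLineReader r = new ConstructedLineReader(source);
    List<String> lines = new ArrayList<>();
    for (String line = r.next(); line != null; line = r.next()) {
      lines.add(line);
    }
    return lines;
  }

  public static void main(String[] args) {
    System.out.println(readAll("first\nsecond"));
  }
}
```

With the single-phase version there is no half-initialized state to defend against, which is one more advantage besides the final fields and the dropped boolean return value.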
##########
File path:
contrib/format-httpd/src/main/java/org/apache/drill/exec/store/httpd/HttpdLogFormatPlugin.java
##########
@@ -75,24 +73,15 @@ private static EasyFormatConfig easyConfig(Configuration fsConf, HttpdLogFormatC
.fsConf(fsConf)
.defaultName(DEFAULT_NAME)
.readerOperatorType(OPERATOR_TYPE)
- .useEnhancedScan(true)
+ .scanVersion(ScanFrameworkVersion.EVF_V2)
.supportsLimitPushdown(true)
.build();
}
@Override
- public ManagedReader<? extends FileSchemaNegotiator> newBatchReader(
- EasySubScan scan, OptionManager options) {
- return new HttpdLogBatchReader(formatConfig, scan.getMaxRecords(), scan);
- }
-
- @Override
- protected FileScanFramework.FileScanBuilder frameworkBuilder(OptionManager options, EasySubScan scan) {
Review comment:
How do we continue to pass `options` in the *FormatPlugin classes?
In other words, does the new batch reader no longer need the OptionManager?