[
https://issues.apache.org/jira/browse/PHOENIX-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832716#comment-16832716
]
Josh Elser commented on PHOENIX-5258:
-------------------------------------
{code:java}
+ try(FSDataInputStream inputStream = fs.open(new Path(path))) {
+ String header = new BufferedReader(new
InputStreamReader(inputStream)).readLine();
+ inputStream.close();
+ return header;
+ }
{code}
Closing the inputStream when you are using try-with-resources is unnecessary.
Can you please create the BufferedReader within the try-with-resources as well?
e.g.
{code:java}
try (FSDataInputStream inputStream = fs.open(new Path(path));
     BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream))) {
  return reader.readLine();
}
{code}
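For anyone who wants to try the pattern outside of Hadoop, here is a minimal stand-alone sketch. It assumes a plain {{java.io.InputStream}} in place of the stream returned by {{fs.open(...)}}, and the class and method names are hypothetical, not taken from the patch. Closing the {{BufferedReader}} also closes the stream it wraps, so no explicit {{close()}} is needed.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only (not part of the patch).
final class HeaderReaderSketch {

    // Returns the first line of the stream, or null if the stream is empty.
    // try-with-resources closes the reader, which in turn closes 'in'.
    static String readHeader(InputStream in) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for the HDFS file here (assumption).
        InputStream csv = new ByteArrayInputStream(
            "ID,NAME\n1,foo\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readHeader(csv)); // prints "ID,NAME"
    }
}
```

Note that {{readLine()}} returns null for an empty file, so the caller probably wants to treat that as an error.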
Some test cases which look to be missing:
* What happens if the user provides {{--header}} but there is no header on the
CSV file? (should error)
* What happens if the user provides both {{--header}} and {{--skip-header}}?
(should error)
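A rough sketch of the checks those two tests would exercise. Everything here is hypothetical: the class, the method names, and in particular the idea of detecting a missing header by comparing the first line against the table's column names are assumptions, not what the patch does. The naive {{split(",")}} also ignores quoting and custom delimiters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical validation helpers, for illustration only.
final class HeaderOptionChecks {

    // Assumption: --header and --skip-header are mutually exclusive.
    static void checkFlags(boolean header, boolean skipHeader) {
        if (header && skipHeader) {
            throw new IllegalArgumentException(
                "--header and --skip-header cannot be used together");
        }
    }

    // Assumption: a first line containing a field that is not a known column
    // name is treated as data, i.e. the file has no header.
    static List<String> checkHeader(String firstLine, Set<String> tableColumns) {
        if (firstLine == null || firstLine.trim().isEmpty()) {
            throw new IllegalArgumentException("input file has no header line");
        }
        List<String> fields = new ArrayList<>();
        for (String raw : firstLine.split(",")) {
            String field = raw.trim();
            if (!tableColumns.contains(field)) {
                throw new IllegalArgumentException("not a column name: " + field);
            }
            fields.add(field);
        }
        return fields;
    }
}
```

With table columns {{ID}} and {{NAME}}, a first line of {{ID,NAME}} passes, while {{1,foo}} is rejected.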
Looks pretty close otherwise. Good work.
> Add support to parse header from the input CSV file as input columns for
> CsvBulkLoadTool
> ----------------------------------------------------------------------------------------
>
> Key: PHOENIX-5258
> URL: https://issues.apache.org/jira/browse/PHOENIX-5258
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Prashant Vithani
> Priority: Minor
> Fix For: 4.15.0, 5.1.0
>
> Attachments: PHOENIX-5258-4.x-HBase-1.4.patch,
> PHOENIX-5258-master.patch
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Currently, CsvBulkLoadTool does not support reading a header from the input
> CSV and expects the content of the CSV to match the table schema. Header
> support can be added to dynamically map the CSV columns to the table
> schema.
> The proposed solution is to introduce another option for the tool, `--header`.
> If this option is passed, the input columns list is constructed by reading
> the first line of the input CSV file.
> * If there is only one file, read the header from the first line and
> generate the `ColumnInfo` list.
> * If there are multiple files, read the header from all the files, and throw
> an error if the headers across files do not match.
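
The multi-file rule in the description above could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, and the headers are assumed to have already been read into a map from file path to first line.

```java
import java.util.Map;

// Hypothetical helper, for illustration only.
final class HeaderConsistencySketch {

    // headersByFile maps each input file path to the first line read from it.
    // Returns the shared header, or throws if any file disagrees.
    static String commonHeader(Map<String, String> headersByFile) {
        if (headersByFile.isEmpty()) {
            throw new IllegalArgumentException("no input files");
        }
        String expected = null;
        String expectedFile = null;
        for (Map.Entry<String, String> entry : headersByFile.entrySet()) {
            String header = entry.getValue();
            if (header == null) {
                throw new IllegalArgumentException(
                    "no header line in " + entry.getKey());
            }
            if (expected == null) {
                // The first file seen defines the header all others must match.
                expected = header;
                expectedFile = entry.getKey();
            } else if (!expected.equals(header)) {
                throw new IllegalArgumentException(
                    "header of " + entry.getKey()
                        + " does not match header of " + expectedFile);
            }
        }
        return expected;
    }
}
```

With a single file the loop runs once and simply returns that file's header, so both cases in the description go through the same path.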