[
https://issues.apache.org/jira/browse/PARQUET-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gang Wu updated PARQUET-2330:
-----------------------------
Fix Version/s: 1.14.0
> Fix convert-csv to show the correct position of the invalid record
> ------------------------------------------------------------------
>
> Key: PARQUET-2330
> URL: https://issues.apache.org/jira/browse/PARQUET-2330
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cli
> Reporter: Kengo Seki
> Assignee: Kengo Seki
> Priority: Minor
> Fix For: 1.14.0
>
>
> Given the following input:
> {code}
> $ cat /tmp/input
> 0
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> a
> {code}
> running the convert-csv subcommand shows a wrong position (0) for the invalid
> record, as follows:
> {code}
> $ java -cp 'target/parquet-cli-1.14.0-SNAPSHOT.jar:target/dependency/*' org.apache.parquet.cli.Main convert-csv /tmp/input --no-header -o /tmp/output
> Unknown error
> java.lang.RuntimeException: Failed on record 0
> at org.apache.parquet.cli.commands.ConvertCSVCommand.run(ConvertCSVCommand.java:186)
> at org.apache.parquet.cli.Main.run(Main.java:163)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.parquet.cli.Main.main(Main.java:193)
> Caused by: org.apache.parquet.cli.util.RecordException: Field field_0: value not a ["null","long"]: 'a'
> at org.apache.parquet.cli.csv.RecordBuilder.makeValue(RecordBuilder.java:125)
> at org.apache.parquet.cli.csv.RecordBuilder.fillIndexed(RecordBuilder.java:98)
> at org.apache.parquet.cli.csv.RecordBuilder.makeRecord(RecordBuilder.java:75)
> at org.apache.parquet.cli.csv.AvroCSVReader.next(AvroCSVReader.java:84)
> at org.apache.parquet.cli.commands.ConvertCSVCommand.run(ConvertCSVCommand.java:182)
> ... 3 more
> Caused by: java.lang.NumberFormatException: For input string: "a"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Long.parseLong(Long.java:589)
> at java.lang.Long.valueOf(Long.java:803)
> at org.apache.parquet.cli.csv.RecordBuilder.makeValue(RecordBuilder.java:163)
> at org.apache.parquet.cli.csv.RecordBuilder.makeValue(RecordBuilder.java:178)
> at org.apache.parquet.cli.csv.RecordBuilder.makeValue(RecordBuilder.java:113)
> ... 7 more
> {code}
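The report above implies the kind of bookkeeping sketched here: count each record as it is read, so that a failure message can name the record that actually failed. This is an illustration only, not the parquet-cli code or the committed fix. The class `RecordPositionSketch` and the helper `convert` are hypothetical names, plain `Long.parseLong` stands in for the real CSV reader, and reporting a zero-based index is an assumed convention.

```java
import java.util.Arrays;
import java.util.List;

public class RecordPositionSketch {

    // Hypothetical helper: parses every line as a long and returns the
    // number of records read. Not part of parquet-cli.
    static long convert(List<String> lines) {
        long count = 0; // records successfully processed so far
        try {
            for (String line : lines) {
                Long.parseLong(line); // throws NumberFormatException on "a"
                count += 1;           // advance only after the record succeeds
            }
        } catch (RuntimeException e) {
            // count equals the zero-based index of the failing record
            // (assumed convention for this sketch)
            throw new RuntimeException("Failed on record " + count, e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Same values as /tmp/input in the report
        List<String> lines = Arrays.asList(
            "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "a");
        try {
            convert(lines);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // prints "Failed on record 10"
        }
    }
}
```

With the input from the report, the sketch names record 10 (zero-based) rather than 0, because the counter is advanced once per successfully read record.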