[
https://issues.apache.org/jira/browse/PHOENIX-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912706#comment-16912706
]
Artem Ervits commented on PHOENIX-5452:
---------------------------------------
[[email protected]] how is the input file compressed? The tool expects a `.csv`
extension; I tried with both a `.csv` and a `.tar.gz` extension. Stack traces below.
{code:java}
HADOOP_CLASSPATH=/opt/hbase/hbase-1.4.10/lib/hbase-protocol-1.4.10.jar:/opt/hbase/hbase-1.4.10/conf \
hadoop jar /opt/phoenix/apache-phoenix-4.14.2-HBase-1.4-bin/phoenix-4.14.2-HBase-1.4-client.jar \
  org.apache.phoenix.mapreduce.CsvBulkLoadTool --table EXAMPLE --input data.csv{code}
{code:java}
Error: java.lang.RuntimeException: org.apache.phoenix.schema.IllegalDataException: java.sql.SQLException: ERROR 201 (22000): Illegal data.
	at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:205)
	at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:77)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: org.apache.phoenix.schema.IllegalDataException: java.sql.SQLException: ERROR 201 (22000): Illegal data.
	at org.apache.phoenix.schema.types.PDataType.newIllegalDataException(PDataType.java:310)
	at org.apache.phoenix.schema.types.PLong.toObject(PLong.java:228)
	at org.apache.phoenix.util.csv.CsvUpsertExecutor$SimpleDatatypeConversionFunction.apply(CsvUpsertExecutor.java:204)
	at org.apache.phoenix.util.csv.CsvUpsertExecutor$SimpleDatatypeConversionFunction.apply(CsvUpsertExecutor.java:120)
	at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:85)
	at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:52)
	at org.apache.phoenix.util.UpsertExecutor.execute(UpsertExecutor.java:133)
	at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:174)
	... 9 more
Caused by: java.sql.SQLException: ERROR 201 (22000): Illegal data.
	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:498)
	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
	... 17 more
Caused by: java.lang.NumberFormatException: For input string: "data.csv0000664000175000017500000005065313527306733012416 0ustar vagrantvagrant71027"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Long.parseLong(Long.java:589)
	at java.lang.Long.parseLong(Long.java:631)
	at org.apache.phoenix.schema.types.PLong.toObject(PLong.java:226)
	... 15 more{code}
{code:java}
Error: java.lang.RuntimeException: java.lang.IllegalArgumentException: CSV record does not have enough values (has 1, but needs 3)
	at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:205)
	at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:77)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.lang.IllegalArgumentException: CSV record does not have enough values (has 1, but needs 3)
	at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:82)
	at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:52)
	at org.apache.phoenix.util.UpsertExecutor.execute(UpsertExecutor.java:133)
	at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:174)
	... 9 more{code}
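For what it's worth, the `NumberFormatException` input string above ("data.csv0000664...0ustar vagrantvagrant...") is a raw tar member header: a `.tar.gz` is a tar archive wrapped in gzip, so even after gunzipping, the stream still begins with the 512-byte tar header, which `PLong` (the BIGINT column) then tries to parse as a number. A plain gzip file has no such archive layer and decompresses straight back to the CSV bytes, which is presumably what extension-aware support would rely on. A minimal round-trip sketch with stock `java.util.zip` (class and method names are illustrative, not Phoenix APIs):

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipCsvRoundTrip {

    // Compress CSV text the same way `gzip data.csv` would on disk.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Read the first line back through GZIPInputStream: plain gzip yields
    // the original CSV row, with no tar header bytes in front of it.
    static String firstLine(byte[] gzipped) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)),
                StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] gz = gzip("1,foo,100\n2,bar,200\n");
        System.out.println(firstLine(gz)); // prints 1,foo,100
    }
}
```

This also explains the second trace: once the tar header line fails to split into the table's three columns, the loader reports "CSV record does not have enough values (has 1, but needs 3)".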
> extend org.apache.phoenix.mapreduce.CsvBulkLoadTool to accept compressed
> csv/json files
> ---------------------------------------------------------------------------------------
>
> Key: PHOENIX-5452
> URL: https://issues.apache.org/jira/browse/PHOENIX-5452
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 4.14.2
> Reporter: Artem Ervits
> Priority: Major
>
> Today, the bulk import tool only accepts uncompressed CSV files; this Jira is
> to add the ability to import compressed data.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)