Hi All,

I am trying to load an approximately 40 GB file using "org.apache.phoenix.mapreduce.CsvBulkLoadTool", but the job fails with the error message below.

INFO mapreduce.Job: Task Id : attempt_1469663368297_56967_m_000042_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:176)
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.RuntimeException: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
    at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:398)
    at org.apache.commons.csv.CSVParser$1.hasNext(CSVParser.java:407)
    at com.google.common.collect.Iterators.getNext(Iterators.java:890)
    at com.google.common.collect.Iterables.getFirst(Iterables.java:781)
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.parse(CsvToKeyValueMapper.java:287)
    at org.apache.phoenix.mapreduce.CsvToKeyValueMapper.map(CsvToKeyValueMapper.java:148)
    ... 9 more
Caused by: java.io.IOException: (startline 1) EOF reached before encapsulated token finished
    at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:282)
    at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
    at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:450)
    at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:395)
    ... 14 more
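For context on the error itself: the "EOF reached before encapsulated token finished" message comes from Apache Commons CSV when a row opens a double-quoted (encapsulated) field that is never closed before the end of the input. The following is a minimal sketch that reproduces the same exception outside the MapReduce job (not my actual loader code, just an illustration; it assumes commons-csv is on the classpath):

import java.io.StringReader;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class EncapsulatedTokenDemo {
    public static void main(String[] args) throws Exception {
        // FS (octal 034 / 0x1C) as the field delimiter, same as in the bulk-load command.
        // The second field starts with a double quote that is never closed.
        String badRow = "a\u001C\"unterminated value\u001Cc\n";

        CSVFormat format = CSVFormat.DEFAULT.withDelimiter('\u001C');
        try (CSVParser parser = new CSVParser(new StringReader(badRow), format)) {
            for (CSVRecord record : parser) {
                System.out.println(record);
            }
        } catch (RuntimeException e) {
            // Prints something like:
            // java.io.IOException: (startline 1) EOF reached before encapsulated token finished
            System.out.println(e.getCause());
        }
    }
}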
Note: I collected a sample of around 1000 records from the same file and was able to load them using the same approach, but when I provide the full file it fails. Can anyone suggest a solution for the above issue?

Below is the command I used:
==================
HADOOP_CLASSPATH=/usr/hdp/current/phoenix-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/conf hadoop jar phoenix-4.4.0.2.4.0.0-169-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool --table "Table_Name" --input "HDFS input file path" -d $'\034'
(The field separator in the file is the FS character, octal 034, so we passed it explicitly with -d.)

Regards,
Radha krishna G
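In case it helps with diagnosis, here is a rough sketch of a check for rows with an odd number of double-quote characters, which is what makes the parser read past the end of the record. This is a hypothetical helper (the class name and single-argument usage are my own, not part of the job above); it assumes quoted fields never legitimately span multiple lines, and scanning 40 GB in a single JVM will be slow:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FindUnbalancedQuotes {
    public static void main(String[] args) throws Exception {
        // args[0]: HDFS path of the CSV file to check (hypothetical usage).
        FileSystem fs = FileSystem.get(new Configuration());
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(fs.open(new Path(args[0])), StandardCharsets.UTF_8))) {
            String line;
            long lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                long quotes = line.chars().filter(c -> c == '"').count();
                if (quotes % 2 != 0) {
                    // An odd quote count means a quoted field is opened on this line
                    // but never closed, which trips the Commons CSV lexer.
                    System.out.println("Unbalanced quotes on line " + lineNo + ": " + line);
                }
            }
        }
    }
}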