Aritra Nayak created PHOENIX-5361:
-------------------------------------
Summary: FileNotFoundException found when schema is in lowercase
Key: PHOENIX-5361
URL: https://issues.apache.org/jira/browse/PHOENIX-5361
Project: Phoenix
Issue Type: Bug
Affects Versions: 4.13.0
Environment: *Hadoop*: 2.6.0-cdh5.9.2
*Phoenix*: 4.13
*HBase*: 1.2.0-cdh5.9.2
*Java*: 8
Reporter: Aritra Nayak
The table name (DUMMY_DATA) is in uppercase, but the schema name (s01) is in
lowercase.
Steps to reproduce:
# Create the Phoenix table:
{code:java}
CREATE TABLE IF NOT EXISTS "s01"."DUMMY_DATA"("id" BIGINT PRIMARY KEY, "firstName" VARCHAR, "lastName" VARCHAR);
{code}
# Upload the CSV file to your preferred HDFS location:
{code:java}
/data/s01/DUMMY_DATA/1.csv
{code}
# Run the hadoop jar command to bulk-load the data:
{code:java}
hadoop jar /opt/phoenix/phoenix4.13-cdh5.9.2-marin-1.5.1/phoenix4.13-cdh5.9.2-marin-1.5.1-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool --s \"\"s01\"\" --t DUMMY_DATA \
    --input /data/s01/DUMMY_DATA/1.csv --zookeeper zk-journalnode-lv-101:2181
{code}
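For context (this note is not part of the original report): Phoenix upper-cases unquoted identifiers and preserves the case of identifiers wrapped in double quotes, which is why the schema above remains lowercase. A minimal sketch of that behavior, assuming the 4.13 SchemaUtil API from the client jar:
{code:java}
import org.apache.phoenix.util.SchemaUtil;

public class IdentifierCaseDemo {
    public static void main(String[] args) {
        // Unquoted identifiers are normalized to upper case by Phoenix.
        System.out.println(SchemaUtil.normalizeIdentifier("s01"));        // -> S01
        // Double-quoted identifiers keep their original case.
        System.out.println(SchemaUtil.normalizeIdentifier("\"s01\""));    // -> s01
        // An already-uppercase name is unaffected either way.
        System.out.println(SchemaUtil.normalizeIdentifier("DUMMY_DATA")); // -> DUMMY_DATA
    }
}
{code}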
The job fails with the following error:
{code:java}
Exception in thread "main" java.io.FileNotFoundException: Bulkload dir /tmp/94ea4875-3453-4ed6-823d-3544ff05fd56/s01.DUMMY_DATA not found
	at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:194)
	at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:289)
	at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:393)
	at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:339)
	at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:355)
	at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:332)
	at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:270)
	at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:183)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.phoenix.mapreduce.CsvBulkLoadTool.main(CsvBulkLoadTool.java:109)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}
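The missing directory name follows Phoenix's schema.table convention: the bulk load tool stages HFiles under a per-table subdirectory of a temporary output path, and LoadIncrementalHFiles then looks that directory up. A hedged sketch of how the qualified name is formed (assuming SchemaUtil.getTableName from the same client jar; the exact call site inside AbstractBulkLoadTool may differ):
{code:java}
import org.apache.phoenix.util.SchemaUtil;

public class BulkloadDirDemo {
    public static void main(String[] args) {
        // schema + "." + table, matching the /tmp/<uuid>/s01.DUMMY_DATA
        // path from the exception above.
        String qualified = SchemaUtil.getTableName("s01", "DUMMY_DATA");
        System.out.println(qualified); // -> s01.DUMMY_DATA
    }
}
{code}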
The MapReduce job reads 1,000,000 records but does not write any:
{code:java}
19/06/18 20:06:24 INFO mapreduce.Job: Counters: 50
	File System Counters
		FILE: Number of bytes read=20
		FILE: Number of bytes written=315801
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=41666811
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=4
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=39894
		Total time spent by all reduces in occupied slots (ms)=56216
		Total time spent by all map tasks (ms)=19947
		Total time spent by all reduce tasks (ms)=14054
		Total vcore-seconds taken by all map tasks=19947
		Total vcore-seconds taken by all reduce tasks=14054
		Total megabyte-seconds taken by all map tasks=40851456
		Total megabyte-seconds taken by all reduce tasks=57565184
	Map-Reduce Framework
		Map input records=1000000
		Map output records=0 <----- see here
		Map output bytes=0
		Map output materialized bytes=16
		Input split bytes=123
		Combine input records=0
		Combine output records=0
		Reduce input groups=0
		Reduce shuffle bytes=16
		Reduce input records=0
		Reduce output records=0
		Spilled Records=0
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=914
		CPU time spent (ms)=49240
		Physical memory (bytes) snapshot=2022809600
		Virtual memory (bytes) snapshot=8064647168
		Total committed heap usage (bytes)=3589275648
	Phoenix MapReduce Import
		Upserts Done=1000000
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=41666688
	File Output Format Counters
		Bytes Written=0
{code}
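Note the discrepancy in the counters: "Upserts Done" reports 1000000, yet "Map output records" is 0 and nothing is written to HDFS, which would explain why the staging directory is never created and the subsequent load step fails with the FileNotFoundException above. To confirm how Phoenix actually recorded the schema, one could query SYSTEM.CATALOG directly; a minimal JDBC sketch (the ZooKeeper quorum is taken from the command above, the class name is illustrative):
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CheckSchemaCase {
    public static void main(String[] args) throws SQLException {
        // Requires the Phoenix client jar on the classpath.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:phoenix:zk-journalnode-lv-101:2181");
             Statement stmt = conn.createStatement();
             // TABLE_SCHEM should come back as lowercase s01 for the quoted schema.
             ResultSet rs = stmt.executeQuery(
                 "SELECT DISTINCT TABLE_SCHEM, TABLE_NAME FROM SYSTEM.CATALOG "
                 + "WHERE TABLE_NAME = 'DUMMY_DATA'")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "." + rs.getString(2));
            }
        }
    }
}
{code}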
{color:#14892c}The same steps (1-3), when followed with the schema name S01, pass and the data is successfully uploaded into the table.{color}
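For reference, the passing run would presumably use the same command with the unquoted, uppercase schema (illustrative; the exact flag form in the passing run is not given in the report):
{code:java}
hadoop jar /opt/phoenix/phoenix4.13-cdh5.9.2-marin-1.5.1/phoenix4.13-cdh5.9.2-marin-1.5.1-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool --s S01 --t DUMMY_DATA \
    --input /data/s01/DUMMY_DATA/1.csv --zookeeper zk-journalnode-lv-101:2181
{code}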