[ https://issues.apache.org/jira/browse/PHOENIX-6721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749526#comment-17749526 ]
Istvan Toth commented on PHOENIX-6721:
--------------------------------------
This is the stack trace:
{noformat}
INFO mapreduce.AbstractBulkLoadTool: Loading HFiles from s3a://odx-qe-bucket/odx-otgvke/audit/cod-brun5ux2kmmd/hbase/output
INFO Configuration.deprecation: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cod-brun5ux2kmmd-leader0.odx-otgv.svbr-nqvp.int.cldr.work:2181
INFO zookeeper.ClientCnxnSocket: jute.maxbuffer value is 4194304 Bytes
INFO zookeeper.ClientCnxn: zookeeper.request.timeout value is 0. feature enabled=
INFO zookeeper.ClientCnxn: Opening socket connection to server cod-brun5ux2kmmd-leader0.odx-otgv.svbr-nqvp.int.cldr.work/10.80.170.105:2181. Will not attempt to authenticate using SASL (unknown error)
INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /10.106.18.86:41030, server: cod-brun5ux2kmmd-leader0.odx-otgv.svbr-nqvp.int.cldr.work/10.80.170.105:2181
INFO zookeeper.ClientCnxn: Session establishment complete on server cod-brun5ux2kmmd-leader0.odx-otgv.svbr-nqvp.int.cldr.work/10.80.170.105:2181, sessionid = 0x100000cb89f03f3, negotiated timeout = 60000
INFO mapreduce.AbstractBulkLoadTool: Loading HFiles for LARGE_TABLE from s3a://odx-qe-bucket/odx-otgvke/audit/cod-brun5ux2kmmd/hbase/output/LARGE_TABLE
Exception in thread "main" java.io.FileNotFoundException: No such file or directory: s3a://odx-qe-bucket/odx-otgvke/audit/cod-brun5ux2kmmd/hbase/output/LARGE_TABLE
    at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3819)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3641)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:3253)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$20(S3AFileSystem.java:3217)
    at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$21(S3AFileSystem.java:3216)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:444)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2290)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2309)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3215)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:1051)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:1009)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.prepareHFileQueue(LoadIncrementalHFiles.java:248)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:355)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:280)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.completebulkload(AbstractBulkLoadTool.java:371)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.submitJob(AbstractBulkLoadTool.java:345)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.loadData(AbstractBulkLoadTool.java:275)
    at org.apache.phoenix.mapreduce.AbstractBulkLoadTool.run(AbstractBulkLoadTool.java:181)
{noformat}
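The listing fails on the per-table subdirectory under --output, i.e. the directory LoadIncrementalHFiles scans is not the directory the MR job actually wrote its HFiles to. One plausible cause for this class of failure, offered only as an assumption and not as a confirmed reading of the Phoenix code, is resolving the output Path against the default filesystem via FileSystem.get(conf), which returns fs.defaultFS (usually HDFS) rather than the s3a filesystem the path itself names. A minimal sketch of the defensive pattern; the OutputPathCheck class and qualifiedOutputPath method are hypothetical:
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OutputPathCheck {

    // Hypothetical helper: resolves the --output argument against the
    // filesystem its own scheme names (S3AFileSystem for s3a:// URIs),
    // not against fs.defaultFS.
    public static Path qualifiedOutputPath(Configuration conf, String output)
            throws IOException {
        Path outputPath = new Path(output);
        // Path.getFileSystem(conf) honours the path's scheme, whereas
        // FileSystem.get(conf) always returns the cluster default filesystem.
        FileSystem fs = outputPath.getFileSystem(conf);
        return outputPath.makeQualified(fs.getUri(), fs.getWorkingDirectory());
    }
}
{code}
If both the MR job's output directory and the directory handed to LoadIncrementalHFiles are qualified this way, they necessarily refer to the same s3a location.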
> CSV bulkload tool fails with FileNotFoundException if --output points to the S3 location
> ----------------------------------------------------------------------------------------
>
> Key: PHOENIX-6721
> URL: https://issues.apache.org/jira/browse/PHOENIX-6721
> Project: Phoenix
> Issue Type: Bug
> Components: core
> Reporter: Sergey Soldatov
> Assignee: Sergey Soldatov
> Priority: Major
>
> We were trying to use the CSV bulk load tool with HBase/Phoenix running on
> top of AWS S3 and found that once the --output parameter points to an S3
> location, the job fails with a FileNotFoundException.
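For reference, an invocation along these lines should reproduce it; the jar name and input path are placeholders, while --table, --input and --output are the tool's actual options:
{noformat}
hadoop jar phoenix-client-*.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table LARGE_TABLE \
    --input /tmp/large_table.csv \
    --output s3a://odx-qe-bucket/odx-otgvke/audit/cod-brun5ux2kmmd/hbase/output
{noformat}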