[ 
https://issues.apache.org/jira/browse/IMPALA-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766289#comment-16766289
 ] 

Pooja Nilangekar commented on IMPALA-8189:
------------------------------------------

I'll take a look at this, it looks like this is due to the s3 eventual 
consistency issues. 

> TestParquet.test_resolution_by_name fails on S3 because 'hadoop fs -cp'  fails
> ------------------------------------------------------------------------------
>
>                 Key: IMPALA-8189
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8189
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Andrew Sherman
>            Assignee: Pooja Nilangekar
>            Priority: Major
>
> In parquet-resolution-by-name.test a parquet file is copied. 
> {quote}
> ---- SHELL
> hadoop fs -cp 
> $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nullable.parq \
> $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/
> hadoop fs -cp 
> $FILESYSTEM_PREFIX/test-warehouse/complextypestbl_parquet/nonnullable.parq \
> $FILESYSTEM_PREFIX/test-warehouse/$DATABASE.db/nested_resolution_by_name_test/
> {quote}
> The first copy succeeds, but the second fails. In the DEBUG output (below) 
> you can see the copy writing data to an intermediate file 
> test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>  and then after the stream is closed, the copy cannot find the file.
> {quote}
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>   
> (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_)
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  7
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  8
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1  
> ->  3
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_create += 1  ->  1
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1  -> 
>  6
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>   
> (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_)
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  9
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  10
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1  
> ->  4
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Initialized 
> S3ABlockOutputStream for 
> test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>  output to FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Writing, dataSize=0, limit=104857600}
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1  -> 
>  7
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>   
> (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_)
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  11
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  12
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1  
> ->  5
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Not Found: 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
> 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: 
> reopen(s3a://impala-test-uswest2-1/test-warehouse/complextypestbl_parquet/nonnullable.parq)
>  for read from new offset range[0-3186], length=4096, streamPosition=0, 
> nextReadPosition=0, policy=normal
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: 
> S3ABlockOutputStream{WriteOperationHelper {bucket=impala-test-uswest2-1}, 
> blockSize=104857600, activeBlock=FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Writing, dataSize=3186, limit=104857600}}: Closing block #1: current 
> block= FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Writing, dataSize=3186, limit=104857600}
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Executing regular upload 
> for WriteOperationHelper {bucket=impala-test-uswest2-1}
> 19/02/12 05:33:13 DEBUG s3a.S3ADataBlocks: Start datablock[1] upload
> 19/02/12 05:33:13 DEBUG s3a.S3ADataBlocks: FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Writing, dataSize=3186, limit=104857600}: entering state Upload
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Clearing active block
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: PUT 3186 bytes to 
> test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: PUT start 3186 bytes
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_put_requests += 1  
> ->  1
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: PUT completed success=true; 3186 
> bytes
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_put_bytes += 3186  
> ->  3186
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: 
> object_put_requests_completed += 1  ->  1
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Finished write to 
> test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_,
>  len 3186
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_delete_requests += 1 
>  ->  1
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Closing 
> org.apache.hadoop.fs.s3a.S3ADataBlocks$BlockUploadData@59b5cc66
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Closing FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Upload, dataSize=3186, limit=104857600}
> 19/02/12 05:33:13 DEBUG s3a.S3ADataBlocks: FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Upload, dataSize=3186, limit=104857600}: entering state Closed
> 19/02/12 05:33:13 DEBUG s3a.S3ADataBlocks: Closed FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Closed, dataSize=3186, limit=104857600}
> 19/02/12 05:33:13 DEBUG s3a.S3ADataBlocks: Closing FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Closed, dataSize=3186, limit=104857600}
> 19/02/12 05:33:13 DEBUG s3a.S3ADataBlocks: block[1]: closeBlock()
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Upload complete to 
> test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>  by WriteOperationHelper {bucket=impala-test-uswest2-1}
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Closing FileBlock{index=1, 
> destFile=/tmp/hadoop-jenkins/s3a/s3ablock-0001-1315190405959387081.tmp, 
> state=Closed, dataSize=3186, limit=104857600}
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Closing 
> org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory@4d157787
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Statistics: 
> OutputStreamStatistics{blocksSubmitted=1, blocksInQueue=1, blocksActive=0, 
> blockUploadsCompleted=0, blockUploadsFailed=0, bytesPendingUpload=0, 
> bytesUploaded=3186, blocksAllocated=1, blocksReleased=1, 
> blocksActivelyAllocated=0, exceptionsInMultipartFinalize=0, 
> transferDuration=0 ms, queueDuration=0 ms, averageQueueTime=0 ms, 
> totalUploadDuration=0 ms, effectiveBandwidth=0.0 bytes/s}
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Closing 
> OutputStreamStatistics{blocksSubmitted=1, blocksInQueue=1, blocksActive=0, 
> blockUploadsCompleted=0, blockUploadsFailed=0, bytesPendingUpload=0, 
> bytesUploaded=3186, blocksAllocated=1, blocksReleased=1, 
> blocksActivelyAllocated=0, exceptionsInMultipartFinalize=0, 
> transferDuration=0 ms, queueDuration=0 ms, averageQueueTime=0 ms, 
> totalUploadDuration=0 ms, effectiveBandwidth=0.0 bytes/s}
> 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: Closing stream close() operation: 
> soft
> 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: Drained stream of 0 bytes
> 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: Stream 
> s3a://impala-test-uswest2-1/test-warehouse/complextypestbl_parquet/nonnullable.parq
>  closed: close() operation; remaining=0 streamPos=3186, nextReadPos=3186, 
> request range 0-3186 length=3186
> 19/02/12 05:33:13 DEBUG s3a.S3AInputStream: Statistics of stream 
> test-warehouse/complextypestbl_parquet/nonnullable.parq
> StreamStatistics{OpenOperations=1, CloseOperations=1, Closed=1, Aborted=0, 
> SeekOperations=0, ReadExceptions=0, ForwardSeekOperations=0, 
> BackwardSeekOperations=0, BytesSkippedOnSeek=0, BytesBackwardsOnSeek=0, 
> BytesRead=3186, BytesRead excluding skipped=3186, ReadOperations=1, 
> ReadFullyOperations=0, ReadsIncomplete=1, BytesReadInClose=0, 
> BytesDiscardedInAbort=0, InputPolicy=0, InputPolicySetCount=1}
> 19/02/12 05:33:13 DEBUG s3a.S3ABlockOutputStream: Ignoring close() as stream 
> is already closed
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Rename path 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>  to 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_rename += 1  ->  1
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: op_get_file_status += 1  -> 
>  8
> 19/02/12 05:33:13 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
>   
> (test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_)
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  13
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_metadata_requests += 
> 1  ->  14
> 19/02/12 05:33:13 DEBUG s3a.S3AStorageStatistics: object_list_requests += 1  
> ->  6
> 19/02/12 05:33:14 DEBUG s3a.S3AFileSystem: Not Found: 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
> 19/02/12 05:33:14 DEBUG s3a.S3AFileSystem: java.io.FileNotFoundException: No 
> such file or directory: 
> s3a://impala-test-uswest2-1/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_
> cp: rename 
> `/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq._COPYING_'
>  to 
> `/test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/nonnullable.parq':
>  Input/output error
> 19/02/12 05:33:14 DEBUG s3a.S3AFileSystem: Filesystem 
> s3a://impala-test-uswest2-1 is closed
> 19/02/12 05:33:14 INFO impl.MetricsSystemImpl: Stopping s3a-file-system 
> metrics system...
> 19/02/12 05:33:14 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> stopped.
> 19/02/12 05:33:14 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
> shutdown complete.
> {quote}
> The test failure is
> {quote}
> Error Message
> query_test/test_scanners.py:703: in test_resolution_by_name     
> use_db=unique_database) common/impala_test_suite.py:419: in run_test_case     
> check_call(cmd, shell=True) /usr/lib64/python2.7/subprocess.py:542: in 
> check_call     raise CalledProcessError(retcode, cmd) E   CalledProcessError: 
> Command 'hadoop fs -cp /test-warehouse/complextypestbl_parquet/nullable.parq 
> \ E   
> /test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/
>  E   hadoop fs -cp /test-warehouse/complextypestbl_parquet/nonnullable.parq \ 
> E   
> /test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/
>  E   ' returned non-zero exit status 1
> Stacktrace
> query_test/test_scanners.py:703: in test_resolution_by_name
>     use_db=unique_database)
> common/impala_test_suite.py:419: in run_test_case
>     check_call(cmd, shell=True)
> /usr/lib64/python2.7/subprocess.py:542: in check_call
>     raise CalledProcessError(retcode, cmd)
> E   CalledProcessError: Command 'hadoop fs -cp 
> /test-warehouse/complextypestbl_parquet/nullable.parq \
> E   
> /test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/
> E   hadoop fs -cp /test-warehouse/complextypestbl_parquet/nonnullable.parq \
> E   
> /test-warehouse/test_resolution_by_name_daec05d5.db/nested_resolution_by_name_test/
> E   ' returned non-zero exit status 1
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to