[
https://issues.apache.org/jira/browse/HADOOP-14142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894203#comment-15894203
]
Steve Loughran commented on HADOOP-14142:
-----------------------------------------
Log. Vishnu: we like to have stacks and logs in a comment rather than in the
issue description, so they don't get included with every email. Use the
\{noformat} or \{code} header and footer to keep the text unformatted. Thanks.
{code}
application/x-www-form-urlencoded; charset=utf-8
Thu, 02 Mar 2017 22:40:25 GMT
/myBkt8/"
17/03/02 14:40:25 DEBUG request: Sending Request: GET
https://webscaledemo.netapp.com:8082 /myBkt8/ Parameters: (max-keys: 1, prefix:
user/vardhan/, delimiter: /, ) Headers: (Authorization: AWS
2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=, User-Agent:
aws-sdk-java/1.7.4 Mac_OS_X/10.12.3
Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017
22:40:25 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8, )
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Connection request:
[route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 0; route
allocated: 0 of 15; total allocated: 0 of 15]
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Connection leased: [id:
10][route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 0;
route allocated: 1 of 15; total allocated: 1 of 15]
17/03/02 14:40:25 DEBUG DefaultClientConnectionOperator: Connecting to
webscaledemo.netapp.com:8082
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Closing connections
idle longer than 60 SECONDS
17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Closing connections
idle longer than 60 SECONDS
17/03/02 14:40:26 DEBUG RequestAddCookies: CookieSpec selected: default
17/03/02 14:40:26 DEBUG RequestAuthCache: Auth cache not set in the context
17/03/02 14:40:26 DEBUG RequestProxyAuthentication: Proxy auth state:
UNCHALLENGED
17/03/02 14:40:26 DEBUG SdkHttpClient: Attempt 1 to execute request
17/03/02 14:40:26 DEBUG DefaultClientConnection: Sending request: GET
/myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1
17/03/02 14:40:26 DEBUG wire: >> "GET
/myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1[\r][\n]"
17/03/02 14:40:26 DEBUG wire: >> "Host: webscaledemo.netapp.com:8082[\r][\n]"
17/03/02 14:40:26 DEBUG wire: >> "Authorization: AWS
2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=[\r][\n]"
17/03/02 14:40:26 DEBUG wire: >> "User-Agent: aws-sdk-java/1.7.4
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
17/03/02 14:40:26 DEBUG wire: >> "Date: Thu, 02 Mar 2017 22:40:25 GMT[\r][\n]"
17/03/02 14:40:26 DEBUG wire: >> "Content-Type:
application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
17/03/02 14:40:26 DEBUG wire: >> "Connection: Keep-Alive[\r][\n]"
17/03/02 14:40:26 DEBUG wire: >> "[\r][\n]"
17/03/02 14:40:26 DEBUG headers: >> GET
/myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1
17/03/02 14:40:26 DEBUG headers: >> Host: webscaledemo.netapp.com:8082
17/03/02 14:40:26 DEBUG headers: >> Authorization: AWS
2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=
17/03/02 14:40:26 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4
Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60
17/03/02 14:40:26 DEBUG headers: >> Date: Thu, 02 Mar 2017 22:40:25 GMT
17/03/02 14:40:26 DEBUG headers: >> Content-Type:
application/x-www-form-urlencoded; charset=utf-8
17/03/02 14:40:26 DEBUG headers: >> Connection: Keep-Alive
17/03/02 14:40:26 DEBUG wire: << "HTTP/1.1 200 OK[\r][\n]"
17/03/02 14:40:26 DEBUG wire: << "Date: Thu, 02 Mar 2017 22:40:26 GMT[\r][\n]"
17/03/02 14:40:26 DEBUG wire: << "Connection: KEEP-ALIVE[\r][\n]"
17/03/02 14:40:26 DEBUG wire: << "Server: StorageGRID/10.3.0.1[\r][\n]"
17/03/02 14:40:26 DEBUG wire: << "x-amz-request-id: 563477649[\r][\n]"
17/03/02 14:40:26 DEBUG wire: << "Content-Length: 266[\r][\n]"
17/03/02 14:40:26 DEBUG wire: << "Content-Type: application/xml[\r][\n]"
17/03/02 14:40:26 DEBUG wire: << "[\r][\n]"
17/03/02 14:40:26 DEBUG DefaultClientConnection: Receiving response: HTTP/1.1
200 OK
17/03/02 14:40:26 DEBUG headers: << HTTP/1.1 200 OK
17/03/02 14:40:26 DEBUG headers: << Date: Thu, 02 Mar 2017 22:40:26 GMT
17/03/02 14:40:26 DEBUG headers: << Connection: KEEP-ALIVE
17/03/02 14:40:26 DEBUG headers: << Server: StorageGRID/10.3.0.1
17/03/02 14:40:26 DEBUG headers: << x-amz-request-id: 563477649
17/03/02 14:40:26 DEBUG headers: << Content-Length: 266
17/03/02 14:40:26 DEBUG headers: << Content-Type: application/xml
17/03/02 14:40:26 DEBUG SdkHttpClient: Connection can be kept alive indefinitely
17/03/02 14:40:26 DEBUG XmlResponsesSaxParser: Sanitizing XML document destined
for handler class
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
17/03/02 14:40:26 DEBUG wire: << "<?xml version="1.0" encoding="UTF-8"?>[\n]"
17/03/02 14:40:26 DEBUG wire: << "<ListBucketResult
xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>myBkt8</Name><Prefix>user/vardhan/</Prefix><Marker></Marker><MaxKeys>1</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated></ListBucketResult>"
17/03/02 14:40:26 DEBUG PoolingClientConnectionManager: Connection [id:
10][route: {s}->https://webscaledemo.netapp.com:8082] can be kept alive
indefinitely
17/03/02 14:40:26 DEBUG PoolingClientConnectionManager: Connection released:
[id: 10][route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 1;
route allocated: 1 of 15; total allocated: 1 of 15]
17/03/02 14:40:26 DEBUG XmlResponsesSaxParser: Parsing XML response document
with handler: class
com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
17/03/02 14:40:26 DEBUG XmlResponsesSaxParser: Examining listing for bucket:
myBkt8
17/03/02 14:40:26 DEBUG request: Received successful response: 200, AWS Request
ID: 563477649
17/03/02 14:40:26 DEBUG S3AFileSystem: Not Found: s3a://myBkt8/user/vardhan
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
s3a://myBkt8
at
org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
at
org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
at
org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
at
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
... 53 elided
{code}
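A possible reading of the log above, offered as a hedged sketch and not as a confirmed diagnosis: if S3A treats a URI with an empty path (no trailing slash) as relative and resolves it against a per-user working directory of the form /user/<name>, then the directory probe would list with the prefix user/vardhan/, which matches the GET request and the "Not Found: s3a://myBkt8/user/vardhan" line. The object and method names below (S3aPrefixSketch, qualifiedPath, listingPrefix) are hypothetical and only model that assumed behaviour with java.net.URI; they are not the Hadoop API.

```scala
import java.net.URI

// Hypothetical model of the ASSUMED S3A path qualification; not Hadoop code.
object S3aPrefixSketch {
  // Assumption: the default working directory is /user/<current user>.
  def workingDir(user: String): String = s"/user/$user"

  // A URI with an authority has either an empty path ("s3a://bucket")
  // or an absolute one ("s3a://bucket/..."). The empty path is assumed
  // to be resolved against the working directory.
  def qualifiedPath(uri: String, user: String): String = {
    val path = Option(new URI(uri).getPath).getOrElse("")
    if (path.isEmpty) workingDir(user) else path
  }

  // Object-store key: the qualified path without its leading slash.
  def toKey(path: String): String = path.stripPrefix("/")

  // Prefix for the directory-probe listing: key plus a trailing slash,
  // or the empty string for the bucket root.
  def listingPrefix(uri: String, user: String): String = {
    val key = toKey(qualifiedPath(uri, user))
    if (key.isEmpty) "" else key + "/"
  }
}
```

Under this model, listingPrefix("s3a://myBkt8", "vardhan") yields user/vardhan/, while listingPrefix("s3a://myBkt8/", "vardhan") yields an empty prefix, so adding a trailing slash to the bucket URI passed to sc.textFile may be worth trying.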
> S3A - Adding unexpected prefix
> ------------------------------
>
> Key: HADOOP-14142
> URL: https://issues.apache.org/jira/browse/HADOOP-14142
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Vishnu Vardhan
> Priority: Critical
>
> Hi:
> S3A seems to add an unexpected prefix to my S3 path.
> Specifically, the following line in the debug log below is unexpected:
> > GET /myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1
> It is not clear where the "prefix" is coming from, or why.
> I executed the following commands:
> sc.setLogLevel("DEBUG")
> sc.hadoopConfiguration.set("fs.s3a.impl","org.apache.hadoop.fs.s3a.S3AFileSystem")
> sc.hadoopConfiguration.set("fs.s3a.endpoint","webscaledemo.netapp.com:8082")
> sc.hadoopConfiguration.set("fs.s3a.access.key","")
> sc.hadoopConfiguration.set("fs.s3a.secret.key","")
> sc.hadoopConfiguration.set("fs.s3a.path.style.access","false")
> val s3Rdd = sc.textFile("s3a://myBkt98")
> s3Rdd.count()
> ----
> The debug log is below:
> application/x-www-form-urlencoded; charset=utf-8
> Thu, 02 Mar 2017 22:40:25 GMT
> /myBkt8/"
> 17/03/02 14:40:25 DEBUG request: Sending Request: GET
> https://webscaledemo.netapp.com:8082 /myBkt8/ Parameters: (max-keys: 1,
> prefix: user/vardhan/, delimiter: /, ) Headers: (Authorization: AWS
> 2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=, User-Agent:
> aws-sdk-java/1.7.4 Mac_OS_X/10.12.3
> Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60, Date: Thu, 02 Mar 2017
> 22:40:25 GMT, Content-Type: application/x-www-form-urlencoded; charset=utf-8,
> )
> 17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Connection request:
> [route: {s}->https://webscaledemo.netapp.com:8082][total kept alive: 0; route
> allocated: 0 of 15; total allocated: 0 of 15]
> 17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Connection leased:
> [id: 10][route: {s}->https://webscaledemo.netapp.com:8082][total kept alive:
> 0; route allocated: 1 of 15; total allocated: 1 of 15]
> 17/03/02 14:40:25 DEBUG DefaultClientConnectionOperator: Connecting to
> webscaledemo.netapp.com:8082
> 17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Closing connections
> idle longer than 60 SECONDS
> 17/03/02 14:40:25 DEBUG PoolingClientConnectionManager: Closing connections
> idle longer than 60 SECONDS
> 17/03/02 14:40:26 DEBUG RequestAddCookies: CookieSpec selected: default
> 17/03/02 14:40:26 DEBUG RequestAuthCache: Auth cache not set in the context
> 17/03/02 14:40:26 DEBUG RequestProxyAuthentication: Proxy auth state:
> UNCHALLENGED
> 17/03/02 14:40:26 DEBUG SdkHttpClient: Attempt 1 to execute request
> 17/03/02 14:40:26 DEBUG DefaultClientConnection: Sending request: GET
> /myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1
> 17/03/02 14:40:26 DEBUG wire: >> "GET
> /myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: >> "Host: webscaledemo.netapp.com:8082[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: >> "Authorization: AWS
> 2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: >> "User-Agent: aws-sdk-java/1.7.4
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: >> "Date: Thu, 02 Mar 2017 22:40:25
> GMT[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: >> "Content-Type:
> application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: >> "Connection: Keep-Alive[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: >> "[\r][\n]"
> 17/03/02 14:40:26 DEBUG headers: >> GET
> /myBkt8/?max-keys=1&prefix=user%2Fvardhan%2F&delimiter=%2F HTTP/1.1
> 17/03/02 14:40:26 DEBUG headers: >> Host: webscaledemo.netapp.com:8082
> 17/03/02 14:40:26 DEBUG headers: >> Authorization: AWS
> 2SNAJYEMQU45YPVYC89D:M8GbLXUuAJ2w5pGx4WJ6hJF3324=
> 17/03/02 14:40:26 DEBUG headers: >> User-Agent: aws-sdk-java/1.7.4
> Mac_OS_X/10.12.3 Java_HotSpot(TM)_64-Bit_Server_VM/25.60-b23/1.8.0_60
> 17/03/02 14:40:26 DEBUG headers: >> Date: Thu, 02 Mar 2017 22:40:25 GMT
> 17/03/02 14:40:26 DEBUG headers: >> Content-Type:
> application/x-www-form-urlencoded; charset=utf-8
> 17/03/02 14:40:26 DEBUG headers: >> Connection: Keep-Alive
> 17/03/02 14:40:26 DEBUG wire: << "HTTP/1.1 200 OK[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: << "Date: Thu, 02 Mar 2017 22:40:26
> GMT[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: << "Connection: KEEP-ALIVE[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: << "Server: StorageGRID/10.3.0.1[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: << "x-amz-request-id: 563477649[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: << "Content-Length: 266[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: << "Content-Type: application/xml[\r][\n]"
> 17/03/02 14:40:26 DEBUG wire: << "[\r][\n]"
> 17/03/02 14:40:26 DEBUG DefaultClientConnection: Receiving response: HTTP/1.1
> 200 OK
> 17/03/02 14:40:26 DEBUG headers: << HTTP/1.1 200 OK
> 17/03/02 14:40:26 DEBUG headers: << Date: Thu, 02 Mar 2017 22:40:26 GMT
> 17/03/02 14:40:26 DEBUG headers: << Connection: KEEP-ALIVE
> 17/03/02 14:40:26 DEBUG headers: << Server: StorageGRID/10.3.0.1
> 17/03/02 14:40:26 DEBUG headers: << x-amz-request-id: 563477649
> 17/03/02 14:40:26 DEBUG headers: << Content-Length: 266
> 17/03/02 14:40:26 DEBUG headers: << Content-Type: application/xml
> 17/03/02 14:40:26 DEBUG SdkHttpClient: Connection can be kept alive
> indefinitely
> 17/03/02 14:40:26 DEBUG XmlResponsesSaxParser: Sanitizing XML document
> destined for handler class
> com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
> 17/03/02 14:40:26 DEBUG wire: << "<?xml version="1.0" encoding="UTF-8"?>[\n]"
> 17/03/02 14:40:26 DEBUG wire: << "<ListBucketResult
> xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>myBkt8</Name><Prefix>user/vardhan/</Prefix><Marker></Marker><MaxKeys>1</MaxKeys><Delimiter>/</Delimiter><IsTruncated>false</IsTruncated></ListBucketResult>"
> 17/03/02 14:40:26 DEBUG PoolingClientConnectionManager: Connection [id:
> 10][route: {s}->https://webscaledemo.netapp.com:8082] can be kept alive
> indefinitely
> 17/03/02 14:40:26 DEBUG PoolingClientConnectionManager: Connection released:
> [id: 10][route: {s}->https://webscaledemo.netapp.com:8082][total kept alive:
> 1; route allocated: 1 of 15; total allocated: 1 of 15]
> 17/03/02 14:40:26 DEBUG XmlResponsesSaxParser: Parsing XML response document
> with handler: class
> com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
> 17/03/02 14:40:26 DEBUG XmlResponsesSaxParser: Examining listing for bucket:
> myBkt8
> 17/03/02 14:40:26 DEBUG request: Received successful response: 200, AWS
> Request ID: 563477649
> 17/03/02 14:40:26 DEBUG S3AFileSystem: Not Found: s3a://myBkt8/user/vardhan
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> s3a://myBkt8
> at
> org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
> at
> org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
> at
> org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
> at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
> at
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
> at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
> ... 53 elided