[ https://issues.apache.org/jira/browse/SQOOP-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Boglarka Egyed updated SQOOP-3243:
----------------------------------
    Attachment:     (was: SQOOP-3243.patch)

> Importing BLOB data causes "Stream closed" error on encrypted HDFS
> ------------------------------------------------------------------
>
>                 Key: SQOOP-3243
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3243
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Boglarka Egyed
>            Assignee: Boglarka Egyed
>         Attachments: SQOOP-3243.patch
>
>
> Importing BLOB data into an encrypted zone causes a "Stream closed" error when both of the following hold:
> * the BLOB data is larger than 16MB -> a LobFile is used to store it
> * the job runs on Java 8 -> its implementation of FilterOutputStream.close() differs from Java 7's (see the sketch after this list)
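> For context, here is a paraphrased sketch of that JDK difference (based on the java.io.FilterOutputStream sources, not Sqoop code): Java 7 silently swallows any IOException thrown by flush() during close(), while Java 8 rewrote close() with try-with-resources, so the flush() exception propagates to the caller:
> {noformat}
> // Java 7 java.io.FilterOutputStream.close() (paraphrased):
> public void close() throws IOException {
>     try {
>         flush();    // an IOException thrown here is silently swallowed
>     } catch (IOException ignored) {
>     }
>     out.close();
> }
>
> // Java 8 java.io.FilterOutputStream.close() (paraphrased):
> public void close() throws IOException {
>     try (OutputStream ostream = out) {
>         flush();    // an IOException thrown here now propagates
>     }
> }
> {noformat}
> This is why a double close() that stayed silent on Java 7 surfaces as an IOException on Java 8.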
> Exception and stack trace:
> {noformat}
> 17/10/12 07:16:04 INFO mapreduce.Job: Running job: job_1507777811520_5091
> 17/10/12 07:16:13 INFO mapreduce.Job: Job job_1507777811520_5091 running in uber mode : false
> 17/10/12 07:16:13 INFO mapreduce.Job: map 0% reduce 0%
> 17/10/12 07:22:37 INFO mapreduce.Job: Task Id : attempt_1507777811520_5091_m_000000_0, Status : FAILED
> Error: java.io.IOException: Stream closed
> at org.apache.hadoop.crypto.CryptoOutputStream.checkStream(CryptoOutputStream.java:268)
> at org.apache.hadoop.crypto.CryptoOutputStream.flush(CryptoOutputStream.java:255)
> at java.io.FilterOutputStream.flush(FilterOutputStream.java:140)
> at java.io.DataOutputStream.flush(DataOutputStream.java:123)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:141)
> at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
> at org.apache.commons.io.output.ProxyOutputStream.close(ProxyOutputStream.java:117)
> at org.apache.sqoop.io.LobFile$V0Writer.close(LobFile.java:1669)
> at org.apache.sqoop.lib.LargeObjectLoader.close(LargeObjectLoader.java:96)
> at org.apache.sqoop.mapreduce.AvroImportMapper.cleanup(AvroImportMapper.java:79)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:148)
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> {noformat}
> The root cause is in the LobFile$V0Writer.close() method, which is invoked from the mapper's cleanup. At LobFile.java line 1669 (seen in the stack trace) it tries to close the countingOut output stream, but the out stream has already been closed at line 1664. Since out is just a wrapper around countingOut, both ultimately point to the same CryptoOutputStream instance, so when the call reaches line 1669 that instance has already been closed by line 1664. The error surfaces because java.io.BufferedOutputStream.flush() also flushes the underlying stream it wraps (here the already-closed CryptoOutputStream), reaching CryptoOutputStream.flush() at line 255, whose checkStream() call throws the "Stream closed" IOException. A sketch of this double-close pattern follows below.
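> A minimal, runnable sketch of the double-close pattern using plain java.io wrappers (hypothetical class and variable names; the real code uses Sqoop's counting stream via commons-io ProxyOutputStream around the HDFS CryptoOutputStream):
> {noformat}
> import java.io.BufferedOutputStream;
> import java.io.ByteArrayOutputStream;
> import java.io.DataOutputStream;
> import java.io.IOException;
> import java.io.OutputStream;
>
> public class DoubleCloseDemo {
>     public static void main(String[] args) throws IOException {
>         // Stand-in for the HDFS CryptoOutputStream: flush() fails once closed.
>         OutputStream crypto = new ByteArrayOutputStream() {
>             private boolean closed = false;
>             @Override public void flush() throws IOException {
>                 if (closed) {
>                     throw new IOException("Stream closed");
>                 }
>             }
>             @Override public void close() {
>                 closed = true;
>             }
>         };
>         // Two wrappers over the same underlying stream, as in LobFile:
>         // out wraps countingOut, countingOut wraps the crypto stream.
>         OutputStream countingOut = new BufferedOutputStream(crypto);
>         DataOutputStream out = new DataOutputStream(countingOut);
>
>         out.close();         // closes crypto through the whole wrapper chain
>         countingOut.close(); // second close: flush() hits the closed crypto
>                              // stream; Java 8 propagates the IOException,
>                              // Java 7 swallowed it inside close()
>     }
> }
> {noformat}
> Closing only the outermost stream of the chain, or guarding against the redundant countingOut close, would avoid flushing an already-closed CryptoOutputStream; this is only one possible fix direction, and the attached patch may take a different approach.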



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
