Java's ZipFile does not work off an input stream, so it cannot be used with HDFS directly. ZipInputStream can work with HDFS, but its utility is limited because, unlike ZipFile, it cannot seek to arbitrary zip entries (which is what distributed processing needs). Also, Java's ZipFile implementation does not work on files larger than 4 GB.
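To make the difference concrete, here is a minimal sketch using only java.util.zip and a local temp file. The class name, entry name, and comment text are made up for illustration; on HDFS the InputStream would presumably come from FileSystem.open(path) rather than Files.newInputStream. Entry comments are stored in the zip central directory at the end of the archive, which ZipInputStream never reads, so it returns null where ZipFile returns the comment.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; names and values are illustrative only.
class ZipCommentDemo {
    static final String ENTRY_NAME = "data.txt";
    static final String ENTRY_COMMENT = "my comment";

    // Writes a one-entry zip with an entry comment to a local temp file.
    static Path writeSampleZip() throws IOException {
        Path zip = Files.createTempFile("zip-comment-demo", ".zip");
        try (OutputStream out = Files.newOutputStream(zip);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            ZipEntry entry = new ZipEntry(ENTRY_NAME);
            entry.setComment(ENTRY_COMMENT); // ends up in the central directory
            zos.putNextEntry(entry);
            zos.write("payload".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return zip;
    }

    // Stream-based read: only local file headers are seen, so no comment.
    static String commentViaStream(InputStream in) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry = zis.getNextEntry();
            return entry == null ? null : entry.getComment();
        }
    }

    // Random-access read: ZipFile parses the central directory.
    static String commentViaZipFile(Path zip) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            return zf.getEntry(ENTRY_NAME).getComment();
        }
    }

    public static void main(String[] args) throws IOException {
        Path zip = writeSampleZip();
        try (InputStream in = Files.newInputStream(zip)) {
            System.out.println("ZipInputStream: " + commentViaStream(in));
        }
        System.out.println("ZipFile: " + commentViaZipFile(zip));
        Files.delete(zip);
    }
}
```

Because ZipFile needs a local, seekable file, copying the archive out of HDFS first is the straightforward way to get at the comments.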
There's a JIRA for this: https://issues.apache.org/jira/browse/MAPREDUCE-210

On 7/19/10 10:48 PM, "Mark Kerzner" <[email protected]> wrote:

Hi,

I want to pass a comment with my ZipEntry. I can put the comment in all right. However, when I read the comment from the ZipEntry back, it does not work if you use ZipInputStream. The comment is only read if you use ZipFile. On the other hand, HDFS FileSystem insists on using streams. I could copy the zip file from HDFS to local, but other than that, is there a way to use ZipFile with HDFS?

Thank you,
Mark
