Sergey Zhemzhitsky created CAMEL-8542:
-----------------------------------------
Summary: hdfs & hdfs2 components are merging data locally instead
of streaming it
Key: CAMEL-8542
URL: https://issues.apache.org/jira/browse/CAMEL-8542
Project: Camel
Issue Type: Improvement
Components: camel-hdfs
Affects Versions: 2.15.0
Reporter: Sergey Zhemzhitsky
Here is the related mailing-list
[conversation|http://camel.465427.n5.nabble.com/HDFS2-Component-and-NORMAL-FILE-type-td5764655.html].
CAMEL-4555 introduced the ability to merge files within a single directory.
The merge is done locally, i.e. the whole merged file is first created on the
local file system, which can be costly in both space and time for
multi-gigabyte or multi-terabyte files.
# It would be more efficient to stream these files directly from HDFS, for
example by wrapping them in a
[SequenceInputStream|http://docs.oracle.com/javase/7/docs/api/java/io/SequenceInputStream.html]
or something like
[MapReducePartInputStreamEnumeration|https://github.com/yahoo/Glimmer/blob/master/src/main/java/com/yahoo/glimmer/util/MapReducePartInputStreamEnumeration.java].
# It would also be great to be able to switch merging on and off by means of
an option or parameter.
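
To illustrate the first point, here is a minimal sketch of the streaming idea using only java.io. It is not the component's actual code: the class LazyPartStreams and the PartOpener interface are hypothetical names made up for this example. With HDFS, the opener would presumably delegate to FileSystem.open(Path). SequenceInputStream asks the Enumeration for the first stream when it is constructed and for each following stream only once the previous one is exhausted, so nothing is written to local disk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of camel-hdfs.
class LazyPartStreams {

    /** Hypothetical callback; for HDFS it could wrap FileSystem.open(Path). */
    interface PartOpener<T> {
        InputStream open(T part) throws IOException;
    }

    /**
     * Returns one stream that reads all parts back to back.
     * A part is opened only when SequenceInputStream requests it,
     * i.e. after the previous part has been fully read and closed.
     */
    static <T> InputStream concat(List<T> parts, final PartOpener<T> opener) {
        final Iterator<T> it = parts.iterator();
        Enumeration<InputStream> streams = new Enumeration<InputStream>() {
            @Override
            public boolean hasMoreElements() {
                return it.hasNext();
            }

            @Override
            public InputStream nextElement() {
                try {
                    return opener.open(it.next());
                } catch (IOException e) {
                    // Enumeration cannot throw checked exceptions.
                    throw new UncheckedIOException(e);
                }
            }
        };
        return new SequenceInputStream(streams);
    }
}
```

A consumer would read the returned stream like any other InputStream, so the merged content can be passed on as a message body without an intermediate local file.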
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)