Hello,

I am writing data in the ORC file format, using the following steps:

=================================================================
1) Creating the "writer" object, as shown below:

         _writer = OrcFile.createWriter(new Path(_fileName),
                 OrcFile.writerOptions(conf)
                         .fileSystem(fs)
                         .inspector(ObjInspector)
                         .stripeSize(100000)
                         .bufferSize(10000)
                         .compress(CompressionKind.ZLIB)
                         .version(OrcFile.Version.V_0_12));

2) Adding rows by executing the code: "_writer.addRow(_record)"

3) After writing all the input-rows, I am calling "_writer.close()"
=================================================================

This logic works fine when the file is small, but when the input data
exceeds 100 GB, I get an OOM error. As I understand it, the writer object
flushes data only when "_writer.close()" is called, hence the issue. I do
not see any API such as "flush" that can be called on the writer object
after writing a portion of the input data.

Given this, I would like to understand how to flush data during an ORC
file write, after processing a portion of the input data and before
calling "close" (which is called only after all input rows have been
processed). Could you please share your thoughts on this?
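
To make the question concrete, here is a minimal stand-alone Java sketch of
the behaviour I am looking for. It does not use the ORC API at all: the
class ThresholdFlushWriter, its flushThresholdBytes parameter, and the
flushCount() and bufferedBytes() helpers are hypothetical names made up
purely for illustration. The point is only that buffered rows are pushed to
the underlying stream once a size threshold is reached, instead of being
held in memory until close().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: this is NOT the ORC writer or its API.
class ThresholdFlushWriter implements AutoCloseable {
    private final OutputStream sink;
    private final int flushThresholdBytes;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int flushCount = 0;

    ThresholdFlushWriter(OutputStream sink, int flushThresholdBytes) {
        this.sink = sink;
        this.flushThresholdBytes = flushThresholdBytes;
    }

    // Buffer one row; flush as soon as the buffer reaches the threshold.
    void addRow(String row) throws IOException {
        byte[] bytes = (row + "\n").getBytes(StandardCharsets.UTF_8);
        buffer.write(bytes, 0, bytes.length);
        if (buffer.size() >= flushThresholdBytes) {
            flush();
        }
    }

    // Push buffered bytes to the sink and release the memory they held.
    void flush() throws IOException {
        if (buffer.size() == 0) {
            return;
        }
        buffer.writeTo(sink);
        sink.flush();
        buffer.reset();
        flushCount++;
    }

    int bufferedBytes() {
        return buffer.size();
    }

    int flushCount() {
        return flushCount;
    }

    // close() only has to write whatever is left over.
    @Override
    public void close() throws IOException {
        flush();
        sink.close();
    }
}

class FlushSketch {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ThresholdFlushWriter writer = new ThresholdFlushWriter(sink, 16)) {
            for (int i = 0; i < 10; i++) {
                writer.addRow("row-" + i);
            }
            System.out.println("flushes before close: " + writer.flushCount());
        }
        System.out.println("bytes written: " + sink.size());
    }
}
```

With a writer like this, memory use is bounded by the threshold rather than
by the total input size, which is what I would like to achieve with the ORC
writer.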

Thanks,
 Ravi
