Hello,
Using the attached simple program (orcrd.java), I am able to read the
column types and column names and print the data row, which is printed
as {1, hello, orcFile}, from the attached ORC file "orcfile1".
However, when I try to get the individual field values from this row (by
casting the row to "OrcRow"), I get the exception:
java.lang.ClassCastException: org.apache.hadoop.hive.ql.io.orc.OrcStruct
incompatible with orcrd$OrcRow
Could you please look at the sample program reading from the attached file
("orcfile1") and let me know how to read the individual field values from
each row?
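(For reference: the cast fails because the Hive 1.x reader always materializes rows as OrcStruct, not as a user-defined class; the generic way to pull field values is through the reader's own StructObjectInspector. A minimal sketch of that approach, assuming the Hive 1.x ORC API; the class name is illustrative:)

```java
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Reader;
import org.apache.hadoop.hive.ql.io.orc.RecordReader;
import org.apache.hadoop.hive.serde2.objectinspector.StructField;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;

public class OrcFieldReader {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Reader reader = OrcFile.createReader(new Path("orcfile1"),
        OrcFile.readerOptions(conf));

    // The file carries its own inspector; rows come back as OrcStruct,
    // so field values are extracted via the inspector, not a cast.
    StructObjectInspector inspector =
        (StructObjectInspector) reader.getObjectInspector();
    List<? extends StructField> fields = inspector.getAllStructFieldRefs();

    RecordReader rows = reader.rows();
    Object row = null;
    while (rows.hasNext()) {
      row = rows.next(row);
      for (StructField field : fields) {
        Object value = inspector.getStructFieldData(row, field);
        System.out.println(field.getFieldName() + " = " + value);
      }
    }
    rows.close();
  }
}
```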
Thanks,
Ravi
From: Ravi Tatapudi/India/IBM
To: [email protected]
Cc: Eric Jacobson/Worcester/IBM@IBMUS, Sumit Kumar6/India/IBM@IBMIN
Date: 12/01/2015 06:16 PM
Subject: Re: Facing issues while writing ORC files
Hello,
I was able to prepare test cases for ORC write (with a dynamic schema), using
the example provided at: https://gist.github.com/omalley/ccabae7cccac28f64812.
Now I am trying to read the data from the ORC file (created as part of
the above example) using the attached test program, but I am getting the
following exception:
"java.lang.ClassCastException:
org.apache.hadoop.hive.ql.io.orc.OrcStruct$OrcStructInspector incompatible
with OrcReader$OrcRowInspector":
[attachment "OrcReader.java" deleted by Ravi Tatapudi/India/IBM]
Could you please take a look and let me know why it is not reading the
data? Or, if there is a corresponding "reader example" that reads the
data written by the above "writer example", I request you to provide
the same.
Thanks,
Ravi
From: "Owen O'Malley" <[email protected]>
To: [email protected]
Date: 11/25/2015 01:01 AM
Subject: Re: Facing issues while writing ORC files
Ok, the problem is that you need to create an ObjectInspector that
specifies the types of the columns. With a generic record, the
reflection-based ObjectInspector doesn't have enough information.
Unfortunately, it is kind of ugly, because there is a lot of boilerplate
code dealing with ObjectInspectors. The following code works by building
an ObjectInspector dynamically:
https://gist.github.com/omalley/ccabae7cccac28f64812
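(For readers of the archive, the shape of that approach is roughly the following: a standard struct ObjectInspector is assembled at run time from field names and per-column inspectors, and rows are then plain Lists. This is a sketch, not the gist itself; column names and values are illustrative:)

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

public class DynamicOrcWriter {
  public static void main(String[] args) throws Exception {
    // Column names and types are decided at run time,
    // not hard-coded in a Java class.
    List<String> names = Arrays.asList("id", "greeting", "source");
    List<ObjectInspector> inspectors = Arrays.asList(
        (ObjectInspector) PrimitiveObjectInspectorFactory.javaIntObjectInspector,
        PrimitiveObjectInspectorFactory.javaStringObjectInspector,
        PrimitiveObjectInspectorFactory.javaStringObjectInspector);
    ObjectInspector inspector =
        ObjectInspectorFactory.getStandardStructObjectInspector(names, inspectors);

    Configuration conf = new Configuration();
    Writer writer = OrcFile.createWriter(new Path("orcfile1"),
        OrcFile.writerOptions(conf).inspector(inspector));
    // With a standard struct inspector, each row is a List of values.
    writer.addRow(new ArrayList<Object>(Arrays.asList(1, "hello", "orcFile")));
    writer.close();
  }
}
```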
On the positive side, we've been working on updating the API as part of
separating ORC out into a separate project. In Hive 2.0 it should look
much simpler:
https://gist.github.com/omalley/7a53cb3ae91fa4c22023
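(Roughly, the newer API in that gist is vector-based: the schema is parsed from a string and rows are filled into a VectorizedRowBatch. A sketch, assuming the standalone org.apache.orc classes; file and column names are illustrative:)

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

public class NewApiWriter {
  public static void main(String[] args) throws Exception {
    // The schema is a string, so it can be built at run time.
    TypeDescription schema =
        TypeDescription.fromString("struct<id:int,greeting:string>");
    Configuration conf = new Configuration();
    Writer writer = OrcFile.createWriter(new Path("orcfile2"),
        OrcFile.writerOptions(conf).setSchema(schema));

    // Rows are written column-by-column into a batch.
    VectorizedRowBatch batch = schema.createRowBatch();
    LongColumnVector id = (LongColumnVector) batch.cols[0];
    BytesColumnVector greeting = (BytesColumnVector) batch.cols[1];

    int row = batch.size++;
    id.vector[row] = 1;
    greeting.setVal(row, "hello".getBytes(StandardCharsets.UTF_8));

    writer.addRowBatch(batch);
    writer.close();
  }
}
```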
.. Owen
On Mon, Nov 23, 2015 at 7:34 AM, Ravi Tatapudi <[email protected]>
wrote:
Hello,
I am Ravi Tatapudi, from IBM India. I am working on a simple tool that
writes data to an ORC file. I am new to the "ORC/Hive world", and I have
prepared my test application primarily based on the example code at:
https://github.com/cloudera/hive/blob/cdh5.4.0-release/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java#L487-L508
I see that I am able to write the data successfully to the ORC file when
the column definition is hard-coded in the class (orcw.java.sample1).
However, when I define an array of objects and assign the values at run
time (orcw.java.sample2), the data is not written to the ORC file.
Please find the sample programs attached.
Could you please take a look and let me know why "orcw.java.sample2" is
not writing data?
Thanks,
Ravi
Attachments: orcrd.java (binary data), orcfile1 (binary data)
