Hi Zhan, any idea why this is happening? I can load ORC files created by a
Hive table, but I can't load ORC files created by Spark itself. It looks like
a bug.
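
One way to narrow this down, as a rough sketch rather than a definitive diagnosis: the path in the exception ends in part_tiny.txt, so it may be worth checking whether the files under the partition directory are ORC at all. ORC files begin with the three ASCII bytes "ORC", so a plain-Java check of the first bytes is enough for a first pass. The class and method names below are made up for illustration, and the sketch assumes the part files were first copied to local disk (for example with `hdfs dfs -get`). It only looks at the header, not at the postscript that the reader complains about.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Spark or Hive.
final class OrcMagicCheck {
    // ORC files begin with the three ASCII bytes "ORC".
    private static final byte[] MAGIC = "ORC".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the first three bytes of the file are "ORC". */
    static boolean startsWithOrcMagic(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    return false; // file is shorter than the magic itself
                }
                read += n;
            }
        }
        return Arrays.equals(head, MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: each argument is a local copy of one file from the partition directory.
        for (String arg : args) {
            Path p = Paths.get(arg);
            String verdict = startsWithOrcMagic(p) ? "ORC magic found" : "no ORC magic";
            System.out.println(p + " -> " + verdict);
        }
    }
}
```

If a file in the partition directory fails this check, the data there was probably written as plain text (or some other format) and the ORC reader would be expected to reject it. If every file passes, the problem is more likely in how the footer is written or read.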

On Wed, Sep 30, 2015 at 12:03 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:

> Hi Zhan, thanks very much. Please find the code below.
>
> Working code, loading data from a path created by a Hive table using the
> Hive console outside of Spark:
>
> DataFrame df =
> hiveContext.read().format("orc").load("/hdfs/path/to/hive/table/partition");
>
> Non-working code, loading data from Hive tables created inside Spark using
> hiveContext.sql insert-into-partition queries:
>
> DataFrame df =
> hiveContext.read().format("orc").load("/hdfs/path/to/hive/table/partition/created/by/spark");
>
> As you can see, the code is the same in both cases; the second snippet just
> tries to load ORC data created by Spark.
> On Sep 30, 2015 11:22 AM, "Zhan Zhang" <zzh...@hortonworks.com> wrote:
>
>> Hi Umesh,
>>
>> A potential reason is that Hive and Spark do not use the same
>> OrcInputFormat. Newer Hive versions have a NewOrcInputFormat, but it is
>> not in Spark because of backward compatibility (it is not available in
>> hive-0.12).
>> Would you mind posting the code that works and the code that does not?
>>
>> Thanks.
>>
>> Zhan Zhang
>>
>> On Sep 29, 2015, at 10:05 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:
>>
>> Hi, I can read/load ORC data created by a Hive table into a dataframe.
>> Why does it throw a Malformed ORC exception when I try to load data
>> created by hiveContext.sql into a dataframe?
>> On Sep 30, 2015 2:37 AM, "Hortonworks" <zzh...@hortonworks.com> wrote:
>>
>>> You can try using a data frame for both the read and the write.
>>>
>>> Thanks
>>>
>>> Zhan Zhang
>>>
>>>
>>> On Sep 29, 2015, at 1:56 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:
>>>
>>> Hi Zhan, thanks for the response. The table is created using Spark
>>> hiveContext.sql, and data is inserted into the table also using
>>> hiveContext.sql (insert into a partitioned table). When I try to load ORC
>>> data into a dataframe, I load a particular partition's data stored in a
>>> path such as /user/xyz/Hive/xyz.db/sparktable/partition1=abc
>>>
>>> Regards,
>>> Umesh
>>> On Sep 30, 2015 02:21, "Hortonworks" <zzh...@hortonworks.com> wrote:
>>>
>>>> How was the table generated, by Hive or by Spark?
>>>>
>>>> If you generate the table using Hive but read it with a data frame, it
>>>> may have some compatibility issues.
>>>>
>>>> Thanks
>>>>
>>>> Zhan Zhang
>>>>
>>>>
>>>> > On Sep 29, 2015, at 1:47 PM, unk1102 <umesh.ka...@gmail.com> wrote:
>>>> >
>>>> > Hi, I have a Spark job which creates Hive tables in ORC format with
>>>> > partitions. It works well: I can read the data back from the Hive
>>>> > table using the Hive console. But if I try to further process the ORC
>>>> > files generated by the Spark job by loading them into a dataframe, I
>>>> > get the following exception:
>>>> >
>>>> > Caused by: java.io.IOException: Malformed ORC file
>>>> > hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt.
>>>> > Invalid postscript.
>>>> >
>>>> > DataFrame df = hiveContext.read().format("orc").load("to/path");
>>>> >
>>>> > Please guide.
>>>> >
>>>> > --
>>>> > View this message in context:
>>>> > http://apache-spark-user-list.1001560.n3.nabble.com/Hive-ORC-Malformed-while-loading-into-spark-data-frame-tp24876.html
>>>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>> >
>>>> > ---------------------------------------------------------------------
>>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>> > For additional commands, e-mail: user-h...@spark.apache.org
>>>> >
>>>> >
>>>>
>>>>
>>>
>>
>>
>>