Rough guess: using the metastore client [1], you can get the Table, and from there its StorageDescriptor [2], which holds the table's location.
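For the partitions half of the question, here is a rough sketch of how the lookup could be organised. It is a guess at the shape, not something I have run against a live metastore. The names HiveInputPaths, LocationSource and inputPaths are made up for illustration. The interface is a stand-in so the snippet compiles without Hive on the classpath; the real calls it would presumably wrap are noted in the comments and match the one-liner below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class HiveInputPaths {

    /**
     * Hypothetical stand-in for the metastore lookups (not a Hive type).
     * A real implementation would presumably wrap
     *   client.getTable(db, table).getSd().getLocation()
     * and, for each Partition from client.listPartitions(db, table, (short) -1),
     *   partition.getSd().getLocation()
     */
    interface LocationSource {
        String tableLocation(String db, String table);
        List<String> partitionLocations(String db, String table);
    }

    /**
     * Returns the directories to hand to Crunch: one per partition when the
     * table is partitioned, otherwise the table's own directory.
     */
    static List<String> inputPaths(LocationSource source, String db, String table) {
        List<String> partitions = source.partitionLocations(db, table);
        if (partitions == null || partitions.isEmpty()) {
            return Collections.singletonList(source.tableLocation(db, table));
        }
        // Partition locations are used as reported, since a partition
        // can live outside the table's directory.
        return new ArrayList<>(partitions);
    }

    public static void main(String[] args) {
        // Canned answers in place of a live metastore; all paths are invented.
        LocationSource fake = new LocationSource() {
            @Override
            public String tableLocation(String db, String table) {
                return "hdfs://nn/warehouse/" + db + ".db/" + table;
            }

            @Override
            public List<String> partitionLocations(String db, String table) {
                return "events".equals(table)
                        ? Arrays.asList("hdfs://nn/warehouse/sales.db/events/dt=2016-01-28",
                                        "hdfs://nn/warehouse/sales.db/events/dt=2016-01-29")
                        : Collections.<String>emptyList();
            }
        };
        for (String path : inputPaths(fake, "sales", "events")) {
            System.out.println(path);
        }
    }
}
```

Each returned string could then be wrapped in a Path and passed to an OrcFileSource the same way the hard-coded inputPath is today.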
Something like:

    Path path = new Path(client.getTable(namespace, name).getSd().getLocation());

[1] - https://hive.apache.org/javadocs/r0.13.1/api/metastore/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.html#getTable(java.lang.String, java.lang.String)
[2] - https://hive.apache.org/javadocs/r0.12.0/api/org/apache/hadoop/hive/metastore/api/StorageDescriptor.html

On Fri, Jan 29, 2016 at 12:19 PM, Josh Wills <[email protected]> wrote:

> I am sure there is a way to do it using the HS2 thrift APIs, but I've
> never done it myself.
>
> On Fri, Jan 29, 2016 at 10:16 AM, Robinson, Landon - Landon <
> [email protected]> wrote:
>
>> On this same note, I still have a similar problem to solve.
>> I can point Crunch at an HDFS location and it will ingest/read the Orc
>> file just fine.
>>
>> But is there a way (maybe leveraging Hcat/Hive apis) to get the file
>> locations dynamically/from Hive? Can I ask Hcat/Hive about a table and its
>> partitions, and have it tell me the file location on HDFS (which I can then
>> pass to Crunch to consume the file into the pipeline)?
>>
>> ---------------------------------------------------------------------------
>> Landon Robinson
>> Big Data & Hadoop Engineer
>> IT Business Intelligence, Lowe’s Companies Inc.
>>
>> ---------------------------------------------------------------------------
>>
>> From: <Robinson>, LCI <[email protected]>
>> Date: Friday, January 29, 2016 at 10:41 AM
>> To: LCI <[email protected]>, Apache Crunch Mailing List <
>> [email protected]>, David Ortiz <[email protected]>
>>
>> Subject: Re: Reading Hive Tables into PCollection
>>
>> *Solved:*
>>
>> Turns out you can use this:
>>
>> private HiveChar acl_idc;
>>
>> That comes from this package: org.apache.hadoop.hive.common.type.HiveChar;
>>
>> Sorry for all the emails, but hope the findings help someone else!
>>
>> From: <Robinson>, LCI <[email protected]>
>> Date: Friday, January 29, 2016 at 10:36 AM
>> To: Apache Crunch Mailing List <[email protected]>, LCI <
>> [email protected]>, David Ortiz <[email protected]>
>> Subject: Re: Reading Hive Tables into PCollection
>>
>> Additionally, we tried allowing those characters to be strings, but get
>> the below error. The real issue is getting the Orc ‘char’ to cast to
>> something we can use in the Orc structure.
>>
>> Exception in thread "main" org.apache.crunch.CrunchRuntimeException: Error while reading local file: file:/tmp/crunch-test/000000_0
>> at org.apache.crunch.io.orc.OrcFileReaderFactory$1.next(OrcFileReaderFactory.java:110)
>> at org.apache.crunch.io.CompositePathIterable$2.next(CompositePathIterable.java:99)
>> at com.google.common.collect.Iterators$5.next(Iterators.java:607)
>> at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:266)
>> at com.google.common.collect.ImmutableList.copyOf(ImmutableList.java:223)
>> at org.apache.crunch.impl.mem.collect.MemCollection.<init>(MemCollection.java:79)
>> at org.apache.crunch.impl.mem.MemPipeline.read(MemPipeline.java:165)
>> at org.apache.crunch.impl.mem.MemPipeline.read(MemPipeline.java:156)
>> at com.lowes.bigdata.closerate.verint.DataQualityDriverTest.run(DataQualityDriverTest.java:57)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> at com.lowes.bigdata.closerate.verint.DataQualityDriverTest.main(DataQualityDriverTest.java:36)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
>> *Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to org.apache.hadoop.io.Text*
>> at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:46)
>> at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:26)
>> at org.apache.crunch.types.orc.OrcUtils.convert(OrcUtils.java:169)
>> at org.apache.crunch.types.orc.OrcUtils.convert(OrcUtils.java:222)
>> at org.apache.crunch.types.orc.Orcs$ReflectInFn.map(Orcs.java:190)
>> at org.apache.crunch.types.orc.Orcs$ReflectInFn.map(Orcs.java:168)
>> at org.apache.crunch.fn.CompositeMapFn.map(CompositeMapFn.java:63)
>> at org.apache.crunch.io.orc.OrcFileReaderFactory$1.next(OrcFileReaderFactory.java:108)
>> ... 15 more
>>
>> *Verint1978Record*
>>
>> public class Verint1978Record {
>>
>>     private String lct_nbr;
>>     private String vid_caa_id;
>>     private Integer hrs_nbr;
>>     private Integer mte_nbr;
>>     private String acl_idc;
>>     private Integer sec_dur;
>>     private Integer sec_to_pcs;
>>     private Integer sec_pcd;
>>     private String use_for_rpr_idc;
>>     private Integer grp_cnt;
>>     private Integer sng_cnt;
>>     private String upd_dt;
>>     private String upd_id;
>>     private String cal_dt;
>>
>> }
>>
>> From: <Robinson>, LCI <[email protected]>
>> Reply-To: Apache Crunch Mailing List <[email protected]>
>> Date: Friday, January 29, 2016 at 10:33 AM
>> To: David Ortiz <[email protected]>, Apache Crunch Mailing List <
>> [email protected]>
>> Subject: Re: Reading Hive Tables into PCollection
>>
>> Right, we’ve been trying this with little luck, largely because I get
>> the error:
>>
>> Caused by: java.lang.ClassCastException:
>> org.apache.hadoop.hive.serde2.io.HiveCharWritable cannot be cast to
>> org.apache.hadoop.hive.ql.io.orc.OrcStruct
>>
>> *Code:*
>>
>> OrcFileSource<Verint1978Record> source = new OrcFileSource<Verint1978Record>(
>>     new Path(inputPath), Orcs.reflects(Verint1978Record.class));
>> PCollection<Verint1978Record> persons = pipeline.read(source);
>>
>> *Verint1978Record*
>>
>> public class Verint1978Record {
>>
>>     private String lct_nbr;
>>     private String vid_caa_id;
>>     private Integer hrs_nbr;
>>     private Integer mte_nbr;
>>     private Character acl_idc;
>>     private Integer sec_dur;
>>     private Integer sec_to_pcs;
>>     private Integer sec_pcd;
>>     private Character use_for_rpr_idc;
>>     private Integer grp_cnt;
>>     private Integer sng_cnt;
>>     private String upd_dt;
>>     private String upd_id;
>>     private String cal_dt;
>>
>> }
>>
>> From: David Ortiz <[email protected]>
>> Date: Friday, January 29, 2016 at 10:19 AM
>> To: LCI <[email protected]>, Apache Crunch Mailing List <
>> [email protected]>
>> Subject: Re: Reading Hive Tables into PCollection
>>
>> http://hortonworks.com/blog/using-orcfile-cascading-apache-crunch/
>>
>> Here's the java excerpt from that article to read into Avro class (I'm
>> assuming).
>>
>> // Read an ORCFile using reflection-based serialization (slowest):
>> OrcFileSource<Person> source = new OrcFileSource<Person>(new Path(inputPath),
>>     Orcs.reflects(Person.class));
>> PCollection<Person> persons = pipeline.read(source);
>>
>> On Fri, Jan 29, 2016 at 10:17 AM Robinson, Landon - Landon <
>> [email protected]> wrote:
>>
>>> Orc format.
>>>
>>> From: David Ortiz <[email protected]>
>>> Reply-To: Apache Crunch Mailing List <[email protected]>
>>> Date: Thursday, January 28, 2016 at 1:22 PM
>>> To: Apache Crunch Mailing List <[email protected]>
>>> Subject: Re: Reading Hive Tables into PCollection
>>>
>>> What format are they stored as?
>>>
>>> On Thu, Jan 28, 2016 at 1:20 PM Robinson, Landon - Landon <
>>> [email protected]> wrote:
>>>
>>>> Crunch Gurus,
>>>>
>>>> What is the Crunch-convenient or recommended way to read the contents
>>>> of a Hive table into a PCollection?
>>>> Thanks!
>>>> Best,
>>>> Landon
>>>>
>>>> NOTICE: All information in and attached to the e-mails below may be
>>>> proprietary, confidential, privileged and otherwise protected from improper
>>>> or erroneous disclosure. If you are not the sender's intended recipient,
>>>> you are not authorized to intercept, read, print, retain, copy, forward, or
>>>> disseminate this message.
>>>> If you have erroneously received this
>>>> communication, please notify the sender immediately by phone
>>>> (704-758-1000) or by e-mail and destroy all copies of this message
>>>> electronic, paper, or otherwise.
>>>>
>>>> *By transmitting documents via this email: Users, Customers, Suppliers
>>>> and Vendors collectively acknowledge and agree the transmittal of
>>>> information via email is voluntary, is offered as a convenience, and is not
>>>> a secured method of communication; Not to transmit any payment information
>>>> E.G. credit card, debit card, checking account, wire transfer information,
>>>> passwords, or sensitive and personal information E.G. Driver's license,
>>>> DOB, social security, or any other information the user wishes to remain
>>>> confidential; To transmit only non-confidential information such as plans,
>>>> pictures and drawings and to assume all risk and liability for and
>>>> indemnify Lowe's from any claims, losses or damages that may arise from the
>>>> transmittal of documents or including non-confidential information in the
>>>> body of an email transmittal. Thank you. *
