Crunch Gurus,

Need some advice. I have experience writing ORC files in Crunch, and I can 
successfully read them back in Crunch and print them out.
But when I try to process them with a DoFn, I get the error below. What should 
I do?

Exception in thread "Thread-5" java.lang.NoSuchFieldError: HIVE_ORC_SPLIT_STRATEGY

Here’s my code:

        logger.info("Generating Hadoop Configuration...");
        Configuration crunchConf = getConf();
        logger.info("Establishing OrcFile Target for Final Output...");
        OrcFileTarget target = new OrcFileTarget(new Path(outputPath));
        //Establish Pipeline
        logger.info("Generating Crunch Map-Reduce Pipeline...");
        Pipeline pipeline = new MRPipeline(DataQualityDriver.class,crunchConf);

        //Establish OrcFileSource (emulates a Java class) linked to HDFS Path
        logger.info("Generating Orc File Source around given HDFS path...");

        OrcFileSource<Verint1978Record> orcsource = new OrcFileSource<Verint1978Record>(
                new Path(inputPath), Orcs.reflects(Verint1978Record.class));

        // Ingest the Orc File into a PCollection
        logger.info("Generating PCollection of Verint1978Record from Data...");
        PCollection<Verint1978Record> data = pipeline.read(orcsource);

        for (Verint1978Record record : data.materialize()){
                System.out.println(record.getAllColumns());
        }

        // This all works fine until THIS point.

        // I can't run these records through a DoFn or write them out without
        // getting the above error.

        // This DoFn simply reads the previous PCollection and emits each record
        // back out as a String (just to test the DoFn).

        PCollection<String> newData = data.parallelDo(
                DataQualityDoFns.DoFn_ProduceSameRecords(), Writables.strings());
        for (String record : newData.materialize()) {
            System.out.println(record);
        }

        PipelineResult result = pipeline.done();


DoFn (super lazy):

static DoFn<Verint1978Record, String> DoFn_ProduceSameRecords(){
    return new DoFn<Verint1978Record, String>() {
        @Override
        public void process(Verint1978Record input, Emitter<String> emitter) {

            emitter.emit(input.getLct_nbr() + "" + input.getVid_caa_id() + ""
                    + input.getHrs_nbr() + "" + input.getMte_nbr() + ""
                    + input.getAcl_idc() + "" + input.getSec_dur() + ""
                    + input.getSec_to_pcs() + "" + input.getSec_pcd() + ""
                    + input.getUse_for_rpr_idc() + "" + input.getGrp_cnt() + ""
                    + input.getSng_cnt() + "" + input.getUpd_dt() + ""
                    + input.getUpd_id() + "" + input.getCal_dt());

        }
    };
}
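
A possible first diagnostic, offered as a sketch and not a confirmed fix: a NoSuchFieldError at runtime generally means a class was compiled against one version of a library and is running against another. Assuming HIVE_ORC_SPLIT_STRATEGY is a constant on Hive's HiveConf.ConfVars (an assumption; the error above does not name the owning class), the snippet below reports which jar that class is loaded from and whether the field is present. ClasspathCheck, locationOf and hasField are hypothetical helper names, not part of Crunch or Hive.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /** Returns where the named class was loaded from, or a note if it cannot be determined. */
    static String locationOf(String className) {
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader (the JDK itself) have no code source.
            return src == null ? "(bootstrap/JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    /** True if the named class is loadable and declares a field (or enum constant) with that name. */
    static boolean hasField(String className, String fieldName) {
        try {
            Class<?> cls = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            cls.getDeclaredField(fieldName);
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed owner of the missing field; adjust if the full stack trace says otherwise.
        String cls = "org.apache.hadoop.hive.conf.HiveConf$ConfVars";
        System.out.println(cls + " loaded from: " + locationOf(cls));
        System.out.println("has HIVE_ORC_SPLIT_STRATEGY: " + hasField(cls, "HIVE_ORC_SPLIT_STRATEGY"));
    }
}
```

Run with the same classpath as the driver. If the field is reported missing, the jar printed on the first line is the one to compare against the Hive version the ORC reader code expects. Note that this only inspects the JVM it runs in; tasks launched on the cluster may see a different classpath.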

---------------------------------------------------------------------------
Landon Robinson
Big Data & Hadoop Engineer
IT Business Intelligence, Lowe’s Companies Inc.
---------------------------------------------------------------------------
