Andy, thanks for your response. I tried again with your suggestion, but I still get an error (see below). It seems I need to resolve the Mahout class dependency that VectorWritableConverter uses.
When I set up elephant-bird, I followed https://github.com/kevinweil/elephant-bird and completed the quick-start along with the protocol-buffer and thrift 0.5 dependencies. That gave me path/to/build/elephant-bird-2.2.3-SNAPSHOT.jar, which I register in the Pig code:

register path/to/build/elephant-bird-2.2.3-SNAPSHOT.jar

Should I set up the Mahout class dependencies separately?

Thanks!

Error message:

Unexpected internal error. could not instantiate 'com.twitter.elephantbird.pig.load.SequenceFileLoader' with arguments '[-c com.twitter.elephantbird.pig.util.IntWritableConverter, -c com.twitter.elephantbird.pig.mahout.VectorWritableConverter -- -sparse]'
Caused by: java.lang.NoClassDefFoundError: org/apache/mahout/math/Vector
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:426)
        at com.twitter.elephantbird.pig.load.SequenceFileLoader.getWritableConverter(SequenceFileLoader.java:233)
        at com.twitter.elephantbird.pig.load.SequenceFileLoader.<init>(SequenceFileLoader.java:152)
        at com.twitter.elephantbird.pig.load.SequenceFileLoader.<init>(SequenceFileLoader.java:175)
        ... 21 more
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)

On May 15, 2012, at 7:01 AM, Andy Schlaikjer wrote:

> Yohan, that's a typo in VectorWritableConverter javadoc. I'll update today.
>
> The SequenceFileStorage and ...Loader classes are in separate packages:
>
> com.twitter.elephantbird.pig.*load*.SequenceFileLoader<https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/load/SequenceFileLoader.java>
> com.twitter.elephantbird.pig.*store*.SequenceFileStorage<https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/store/SequenceFileStorage.java>
>
> Both of these classes rely on the
> WritableConverter<https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/util/WritableConverter.java> interface.
> They classload converters at runtime, given the classname of the
> converters you'd like to use for key and value Writable instances. When
> dealing with SequenceFile<IntWritable, VectorWritable> data, do this:
>
> {{{
>
> %declare SEQFILE_LOADER 'com.twitter.elephantbird.pig.load.SequenceFileLoader';
> %declare INT_CONVERTER 'com.twitter.elephantbird.pig.util.IntWritableConverter';
> %declare VECTOR_CONVERTER 'com.twitter.elephantbird.pig.mahout.VectorWritableConverter';
>
> pair = LOAD '$INPUT_PATH' USING $SEQFILE_LOADER (
>   '-c $INT_CONVERTER',
>   '-c $VECTOR_CONVERTER -- -sparse'
> );
>
> }}}
>
> Hope this helps!
>
> Andy
>
>
> On Mon, May 14, 2012 at 11:57 PM, Ted Dunning <[email protected]> wrote:
>> Sounds like a class path issue.
>>
>> Sent from my iPhone
>>
>> On May 15, 2012, at 2:43 AM, Yohan Chin <[email protected]> wrote:
>>
>>>
>>> Hi,
>>> Recently, I've tried to utilize elephant-bird for loading mahout result into pig.
>>> I could install elephant-bird and got .jar file.
>>> and followed instructions as appears in below; (written by Andy Schlaikjer)
>>> https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/mahout/VectorWritableConverter.java
>>> ex)
>>> pair = LOAD '$data' USING com.twitter.elephantbird.pig.store.SequenceFileLoader (
>>>   '-c $INT_CONVERTER',
>>>   '-c $VECTOR_CONVERTER -- -dense -cardinality 2'
>>> );
>>> however, there is no sequenceFileLoader in store folder, and load/sequencefileloader.java doesn't import "com.twitter.elephantbird.pig.mahout.VectorWritableConverter"
>>>
>>> Is there any points I've missed?
>>>
>>> Thanks a lot for this awesome api!
>>>
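
The class path diagnosis suggested in the thread can be checked outside Pig. The sketch below is not part of elephant-bird or Mahout; `ClasspathCheck`, `isLoadable` and `missing` are hypothetical names used only for illustration. It asks the JVM to locate each class with `Class.forName` and reports the ones it cannot find. With no arguments it checks the two class names that appear in the stack trace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of elephant-bird or Mahout.
class ClasspathCheck {

    /**
     * Returns true if the named class can be loaded by this class's loader.
     * Returns false if the class itself, or something it needs at load
     * time (for example a superclass), cannot be found.
     */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: load the class but do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so it is covered here too.
            return false;
        }
    }

    /** Returns the subset of the given class names that could not be loaded. */
    static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isLoadable(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Default to the class names taken from the stack trace in this thread.
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.mahout.math.Vector",
            "com.twitter.elephantbird.pig.mahout.VectorWritableConverter"
        };
        for (String name : names) {
            System.out.println((isLoadable(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

One way to use it is to compile it and run it with the same jars the Pig script registers, for example `java -cp path/to/build/elephant-bird-2.2.3-SNAPSHOT.jar:. ClasspathCheck`. If `org.apache.mahout.math.Vector` is reported as MISSING, the jar that provides it (assumed here to be the Mahout math jar, whose file name depends on the Mahout build) has to be on the class path, and presumably registered in the Pig script alongside the elephant-bird jar.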
