Does pig support referring previous field

2016-12-03 Thread Ryan
statement? Best, Ryan Xu

Re: Help with Pig UDF?

2014-12-06 Thread Ryan
variable is not necessary... a simple instance variable is just fine. On Fri Dec 05 2014 at 2:27:53 PM Ryan freelanceflashga...@gmail.com wrote: After running it with updated code, it seems like the problem has to do with something related to Tika since my output says that my input
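The advice in this reply (keep the expensive parser in an ordinary instance variable and build it on first use) can be sketched as follows. This is a Python analogue only, not the Pig Java UDF API: the class name `ExtractText`, the method `call`, and the `parser_factory` argument are all hypothetical, and the sketch assumes the framework creates the object once and then calls it once per record.

```python
class ExtractText:
    """Hypothetical stand-in for a UDF object that is built once and called per record."""

    def __init__(self, parser_factory):
        # parser_factory is an assumed zero-argument callable that builds the costly parser.
        self._parser_factory = parser_factory
        self._parser = None  # plain instance variable, filled in on first use

    def call(self, record):
        # Build the parser only once; later calls reuse the cached instance.
        if self._parser is None:
            self._parser = self._parser_factory()
        return self._parser(record)
```

Because the cache lives on the instance, no static (class-level) state is needed as long as the same object handles every record.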

Help with Pig UDF?

2014-12-05 Thread Ryan
, Ryan

Re: Help with Pig UDF?

2014-12-05 Thread Ryan
using a static variable inside the ExtractTextFromPDFs function to store the PdfParser once it has been initialized once? I'm still learning how to best do things within the Pig/MapReduce/Hadoop framework Ryan On Fri, Dec 5, 2014 at 1:35 PM, Ryan freelanceflashga...@gmail.com wrote: Thanks Pradeep

Re: Using PIG with complex SQL Statement

2014-10-31 Thread Ryan Prociuk
Vineet, Pig 0.12 supports the IN clause for filtering X = FILTER A BY (f1==8) OR (NOT (f2+f3 > f1)) OR (f1 IN (9, 10, 11)); Ryan On Thu, Oct 30, 2014 at 11:09 PM, Vineet Mishra clearmido...@gmail.com wrote: Hi Dan, Thanks for your response, although FILTER cat_ids BY (category_id == 1
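To make the logic of that FILTER expression concrete, here is a small Python sketch of the same predicate applied to in-memory tuples. It assumes the middle comparison is `f2+f3 > f1`, as in the FILTER example in the Pig documentation; the sample rows and the name `keep` are made up for illustration, and Pig's null handling is not modelled.

```python
def keep(f1, f2, f3):
    """Mirror of: (f1==8) OR (NOT (f2+f3 > f1)) OR (f1 IN (9, 10, 11))."""
    return (f1 == 8) or (not (f2 + f3 > f1)) or (f1 in (9, 10, 11))

# Hypothetical sample relation A with fields (f1, f2, f3).
rows = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3), (7, 2, 5), (10, 5, 9)]

# Equivalent of X = FILTER A BY ...;
kept = [row for row in rows if keep(*row)]
```

The row `(10, 5, 9)` passes only because of the IN clause, which is the part that needs Pig 0.12 according to the reply.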

Re: Json Parsing in Apache Pig

2014-07-25 Thread Ryan Compton
I've found Twitter's elephantbird library very useful here (https://github.com/kevinweil/elephant-bird ) a = LOAD 'file3.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') Will parse the JSON into a map http://pig.apache.org/docs/r0.11.1/basic.html#map-schema the JSONArray

Re: Json Parsing in Apache Pig

2014-07-25 Thread Ryan Prociuk
') AS (json:map[]); B = FOREACH A GENERATE json#'col1' AS col1; Ryan On Fri, Jul 25, 2014 at 4:55 PM, Satish Kolli feedwo...@gmail.com wrote: Did you try the standard JsonLoader? I didn't personally use it but it looks like you can specify the schema to extract/parse from your json

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.slf4j.spi.LocationAwareLogger.log

2014-05-27 Thread Ryan Compton
Upgraded to 12.1 and now I'm getting this whenever I try to REGISTER a jar. I don't use slf4j, so I have no idea what's causing it. Has anyone else run into it? My Hadoop version is cdh3u3. Pig Stack Trace --- ERROR 2998: Unhandled internal error.

Re: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.slf4j.spi.LocationAwareLogger.log

2014-05-27 Thread Ryan Compton
Update: Turns out I'm getting it on 11.1 as well. Must be a problem with something in my jar. On Tue, May 27, 2014 at 1:26 PM, Ryan Compton compton.r...@gmail.com wrote: Upgraded to 12.1 and now I'm getting this whenever I try to REGISTER a jar. I don't use slf4j, so I have no idea what's

Re: Need example of python code with dependency files

2013-11-09 Thread Ryan Compton
: Ryan Compton [mailto:compton.r...@gmail.com] Sent: Tuesday, November 05, 2013 6:40 PM To: user@pig.apache.org Subject: Need example of python code with dependency files I have some python code I'd like to deploy with a pig script. The .py code takes input from sys.stdin and outputs to sys.stdout

Need example of python code with dependency files

2013-11-05 Thread Ryan Compton
I have some python code I'd like to deploy with a pig script. The .py code takes input from sys.stdin and outputs to sys.stdout. It also needs some parameter files to run properly. The book Programming Pig tells me: The workaround for this is to create a TAR file and ship that, and then have a
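A minimal sketch of the kind of script described here: it reads records from an input stream, consults parameters loaded from a shipped file, and writes records to an output stream. The `key=value` parameter format, the `suffix` parameter, and the function names are assumptions for illustration; the original script and its parameter files are not shown in the thread.

```python
import sys

def parse_params(lines):
    """Parse key=value lines (an assumed format), skipping blanks and # comments."""
    params = {}
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            params[key.strip()] = value.strip()
    return params

def process(lines, params):
    """Yield one tab-separated output record per tab-separated input record."""
    suffix = params.get("suffix", "")
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        yield "\t".join(fields + [suffix])

def main(param_path, stdin=sys.stdin, stdout=sys.stdout):
    """Wire the pieces together the way a streaming job would invoke the script."""
    with open(param_path) as fh:
        params = parse_params(fh)
    for record in process(stdin, params):
        stdout.write(record + "\n")
```

In a real streaming script the file would end with a call such as `main(sys.argv[1])`, with the parameter file's path resolved relative to wherever the shipped files are unpacked on the task node.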

Re: Reading json data

2013-10-22 Thread Ryan Compton
It sounds like you have two problems: parsing json and joining the datasets For reading jsons you can use: http://stackoverflow.com/questions/11035105/processing-json-through-pig-scripts/16501542#16501542 For matching the types you could filter for type1 then join against the data_dict_1 and

Re: Best practices for handling dependencies in Java UDFs?

2013-08-08 Thread Ryan Compton
I often do this, and then just register one giant .jar <!-- Plugin to create a single jar that includes all dependencies --> <plugin> <artifactId>maven-assembly-plugin</artifactId> <version>2.4</version>

Re: Use Pig to parse JSON objects

2013-05-22 Thread Ryan Compton
I've been using twitter's elephantbird and have been very happy with it so far. Here's an example of parsing a nested json with it: json_eb = LOAD '$IN_DIRS' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') as (json:map[]); --parse json with twitter's library parsed0 = FOREACH

Streaming ERROR 2055: Received Error while processing the map plan:

2013-03-28 Thread Ryan Compton
I want to run a python script in a pig script. Here's the .py script: http://pastebin.com/JB26B7BE Here's the pig script: http://pastebin.com/JvD9t3Si Here's what happens: http://pastebin.com/4YjENb5q What could this be?

Failed to create DataStorage ?

2013-03-21 Thread Ryan Compton
I can start a grunt shell just fine: -bash-3.2$ pwd /home/rfcompton/Downloads/pig-0.11.0-src -bash-3.2$ ./bin/pig 2013-03-21 12:55:00,048 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1-SNAPSHOT (rexported) compiled Mar 21 2013, 12:49:21 2013-03-21 12:55:00,049 [main] INFO

Re: Failed to create DataStorage ?

2013-03-21 Thread Ryan Compton
-examples-0.20.2-cdh3u3.jar hadoop-test-0.20.2-cdh3u3.jar hadoop-tools-0.20.2-cdh3u3.jar but I still have the same problem. More info: http://pastebin.com/MfUHwu0X On Thu, Mar 21, 2013 at 1:16 PM, Prashant Kommireddi prash1...@gmail.com wrote: Hi Ryan, Seems like you are trying to connect to a hadoop

Re: Failed to create DataStorage ?

2013-03-21 Thread Ryan Compton
of pig -secretDebugCmd On Thu, Mar 21, 2013 at 1:29 PM, Ryan Compton compton.r...@gmail.com wrote: Hi, Hmm, I've got that much: -bash-3.2$ ls $HADOOP_HOME | grep cdh3u3 hadoop-0.20.2-cdh3u3-ant.jar hadoop-0.20.2-cdh3u3-core.jar hadoop-0.20.2-cdh3u3-examples.jar hadoop-0.20.2-cdh3u3

Re: Failed to create DataStorage ?

2013-03-21 Thread Ryan Compton
-withouthadoop.jar instead of /home/rfcompton/Downloads/pig-0.11.0-src/pig.jar and make sure 0.20.2 hadoop is on the classpath. On Thu, Mar 21, 2013 at 1:36 PM, Ryan Compton compton.r...@gmail.com wrote: -bash-3.2$ pig -secretDebugCmd Find hadoop at /usr/bin/hadoop dry run: HADOOP_CLASSPATH

Re: IllegalArgumentException: Not a host:port pair

2012-03-23 Thread Ryan Cole
The classpath for Pig, correct? Ryan On Fri, Mar 23, 2012 at 4:00 AM, Sam William sa...@stumbleupon.com wrote: Ryan, This message is specific to Hbase 0.92.1 . Make sure HBase 0.90.1 jar is not in the classpath before the 0.92.1 jar files Sam On Mar 22, 2012, at 8:20 PM, Ryan Cole

Re: IllegalArgumentException: Not a host:port pair

2012-03-23 Thread Ryan Cole
That was it! I don't think that I even had my HBase path on the PIG_CLASSPATH, at all. I simply put HBase on the path and now it works. Thank you, Ryan On Fri, Mar 23, 2012 at 10:02 AM, Ryan Cole r...@rycole.com wrote: The classpath for Pig, correct? Ryan On Fri, Mar 23, 2012 at 4:00 AM

IllegalArgumentException: Not a host:port pair

2012-03-22 Thread Ryan Cole
even the simplest query examples, using Pig, I get the following error: `ERROR 2017: Internal error creating job configuration.` and, the log file has the following more specific error: `Caused by: java.lang.IllegalArgumentException: Not a host:port pair: �^@^@^@^P8948@ryan-serverlocalhost

Re: IllegalArgumentException: Not a host:port pair

2012-03-22 Thread Ryan Cole
or not, though. Ryan On Mar 22, 2012, at 10:02 PM, Norbert Burger wrote: Actually on second glance, this seems like an issue not with the HBase config, but with the server:port info inside your .META. table. Have you tried LOADing from a different table besides events? From the HBase shell

Re: Pig Conditionals (Do I have to use UDFs)?

2011-09-14 Thread Ryan Hoegg
PigStorage(',') AS (item:chararray,number:int); MAPPED = JOIN EXAMPLE_SOURCE BY number LEFT OUTER, MAPPINGS BY number; PRETTY = FOREACH MAPPED GENERATE item, name; DUMP PRETTY; (a,one) (c,one) (a,two) (b,two) (d,three) (d,four) -- Ryan Hoegg On Wed, Sep 14, 2011 at 3:27 PM, Eli Finkelshteyn iefin
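The approach in this reply, replacing a conditional with a LEFT OUTER JOIN against a small mapping relation, can be illustrated in Python. The input rows below are hypothetical, chosen to be consistent with the DUMP output quoted above plus one unmatched row; the sketch also assumes each number appears at most once in the mapping, which a real Pig join does not require.

```python
# Hypothetical contents of EXAMPLE_SOURCE as (item, number) pairs.
example_source = [("a", 1), ("c", 1), ("a", 2), ("b", 2), ("d", 3), ("d", 4), ("e", 5)]

# Hypothetical contents of MAPPINGS, keyed by number.
mappings = {1: "one", 2: "two", 3: "three", 4: "four"}

def left_outer_join(source, names):
    """Keep every source row; rows with no mapping get None, like a null in Pig."""
    return [(item, names.get(number)) for item, number in source]

pretty = left_outer_join(example_source, mappings)
```

The row `("e", 5)` survives with a `None` name, which is the behaviour that distinguishes a left outer join from an inner join.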

Re: Pig Conditionals (Do I have to use UDFs)?

2011-09-14 Thread Ryan Hoegg
,better) (d,4,better) -- Ryan Hoegg On Wed, Sep 14, 2011 at 4:24 PM, Eli Finkelshteyn iefin...@gmail.com wrote: Sorry, bad example, I guess. I want something I can do case statements with. In this case I could map instead, but if I wanted to use less straight-forward cases (i.e. one case where

Re: Problem with dependencies with fresh checkout from trunk

2011-09-09 Thread Ryan Hoegg
Is anyone familiar with getting the ivy dependencies to work on trunk? Thanks again, Ryan Hoegg On Thu, Sep 8, 2011 at 4:31 PM, Ryan Hoegg ryan.ho...@gmail.com wrote: Apache Ant(TM) version 1.8.2 compiled on June 3 2011 On Thu, Sep 8, 2011 at 4:12 PM, Daniel Dai da...@hortonworks.com wrote

Problem with dependencies with fresh checkout from trunk

2011-09-08 Thread Ryan Hoegg
: 'master'. It was required from org.apache.pig#Pig;0.10.0-SNAPSHOT compile [ivy:resolve] :: Am I doing something wrong? Thanks, Ryan Hoegg