Re: How to run pig on spark ?

2016-12-20 Thread Divya Gehlot
http://blog.cloudera.com/blog/2014/09/pig-is-flying-apache-pig-on-apache-spark/ Hope this helps. Thanks, Divya On 21 December 2016 at 11:13, canan chen wrote: > I am trying to run Pig on Spark but hit the following error. Could anyone help > me with that? And BTW where can I find

Fwd: [Vote] : Spark-csv 1.3 + Spark 1.5.2 - Error parsing null values except String data type

2016-02-23 Thread Divya Gehlot
Hi, Please vote if you have faced this issue. I am getting an error when parsing null values with spark-csv. Data file: name age alice 35 bob null peter 24 Code: spark-shell --packages com.databricks:spark-csv_2.10:1.3.0 --master yarn-client -i /TestDivya/Spark/Testnull.scala Testnull.scala >

Re: Need help :Does anybody has HDP cluster on EC2?

2016-02-15 Thread Divya Gehlot
icMapReduce/latest/DeveloperGuide/emr-ssh-tunnel.html > > Regards > Sab > > On Mon, Feb 15, 2016 at 1:55 PM, Divya Gehlot <divya.htco...@gmail.com> > wrote: > >> Hi, >> I have hadoop cluster set up in EC2. >> I am unable to view application logs in Web UI a

Re: Need help :Does anybody has HDP cluster on EC2?

2016-02-15 Thread Divya Gehlot
-default.xml file to change it I guess. > > Thanks > Best Regards > > On Mon, Feb 15, 2016 at 1:55 PM, Divya Gehlot <divya.htco...@gmail.com> > wrote: > > > Hi, > > I have hadoop cluster set up in EC2. > > I am unable to view application logs in Web UI as

Need help :Does anybody has HDP cluster on EC2?

2016-02-15 Thread Divya Gehlot
Hi, I have a Hadoop cluster set up in EC2. I am unable to view application logs in the Web UI because it is using the internal IP, like below: http://ip-xxx-xx-xx-xxx.ap-southeast-1.compute.internal:8042 How can I change this to the external one or

Pig Scripts -Performance and Tuning

2016-02-03 Thread Divya Gehlot
Hi, Is there any strategy to follow for performance tuning of Pig scripts? I would really appreciate any pointers or guidance. Thanks, Divya

Re: Pig script to load C++ library

2016-02-02 Thread Divya Gehlot
@Shashikant -Check the link .. On 2 February 2016 at 22:52, Shashikant K wrote: > Hello, > > How can I read the return values from the external libraries back in PIG > script again? I

ERROR 2997: Encountered IOException. Could not find file to substitute parameters for

2016-02-01 Thread Divya Gehlot
When running a Pig script with the -dryrun option, I get ERROR 2997. *The Pig script, parameter files, and input files are all placed in an HDFS location*. pig -param_file hdfs://xxx.xx.xx.xxx:8020/user/hdfs/pig/data/part-m-0--param "input_file=/user/hdfs/pig/data/input_file.txt" --param

pros and cons of using DBStorage for large data sets

2016-01-29 Thread Divya Gehlot
Hi, I have a use case where I need to store the processed data to an Oracle DB using DBStorage. Does anyone have experience storing large data sets to a DB using DBStorage? I would like to know the pros and cons of using DBStorage for large data sets. Thanks, Divya

Difference between %declare, %default, define

2016-01-28 Thread Divya Gehlot
Hi, I would like to know the difference between the %declare, %default, and define keywords used in Pig Latin. Thanks, Divya

Re: Read column data as tuple

2016-01-26 Thread Divya Gehlot
On 26 January 2016 at 15:59, Divya Gehlot <divya.htco...@gmail.com> wrote: > Hi Prashant , > Thanks for your solution. > But got stuck in one more issue . > How do I pass this tuple to group by function > > I tried passing it > getting below error > > grunt>

getting error in passing tuple to GroupBy dynamically

2016-01-25 Thread Divya Gehlot
Hi, I have two files Group_condition.txt Column1|Y Column2|N Column3|Y Load_cfl = LOAD '/user/hdfs/file.txt' USING PigStorage('|') as (code:chararray,book_code:int,currency_code:chararray,start_date:datetime,end_date:datetime,type:chararray,amount:double ); Load_GroupBy = LOAD

Group by Dynamically

2016-01-25 Thread Divya Gehlot
Hi, I have two files File1 Group by Condition Field1 Y Field 2 N Field3 Y File2 is data file having field1,field2,field3 etc.. field1 field2 field3 field4 field5 data1 data2 data3 data4 data 5 data11 data22 data33 data44 data 55 Now my requirement is to group

Re: getting error in passing tuple to GroupBy dynamically

2016-01-25 Thread Divya Gehlot
roupBy_Condition" look like? I'm guessing you want to load that as a > bag/tuple before flattening, but you're reading as a chararray instead. > > On Monday, January 25, 2016, Divya Gehlot <divya.htco...@gmail.com> wrote: > > > Hi, > > I have two files >

Read column data as tuple

2016-01-25 Thread Divya Gehlot
Hi, I have file data as below. The data is dynamic: Column1 | Y Column2 | N Column3 |Y Column4| Y Column5|N I need to filter the data which is Y and then read those columns as a tuple so that I can pass it to my Groupby function. Filtered data: Column1,Y Column3,Y Column4,Y and then convert

Re: Unable to read the input file : Pig DBStorage to MySQL

2016-01-21 Thread Divya Gehlot
Found the resolution. Refer to the link below: http://stackoverflow.com/a/34920051/4981746 On 21 January 2016 at 14:45, Divya Gehlot <divya.htco...@gmail.com> wrote: > I am trying to store data to MySQL using DBStorage in a Pig script. When I > run the script, I am getting error un

Unable to read the input file : Pig DBStorage to MySQL

2016-01-20 Thread Divya Gehlot
I am trying to store data to MySQL using DBStorage in a Pig script. When I run the script, I get an error saying it is unable to read the input file, whereas when I dump or store the same data file it works fine. Sample Pig script: %default DATABASE_HOST 'localhost';%default DATABASE_NAME

Store pig output to Oracle

2016-01-20 Thread Divya Gehlot
Hi, I would really appreciate it if somebody could share pointers or an example of how to store Pig script output to an Oracle database. Thanks, Divya

Store Pig output data to Oracle

2016-01-20 Thread Divya Gehlot
Hi, I am trying to store the Pig output data to Oracle. Script : REGISTER /TestDivya/Pig/piggybank-0.15.0.jar ; A = LOAD '/tmp/TestDivya/Pig/PigTest.txt' using PigStorage(',') as (name: chararray, age: int); STORE A INTO 'dummy' using

Query on Pig on Spark (Spork)

2015-11-24 Thread Divya Gehlot
Hi, As a beginner, I have the queries below on Spork (Pig on Spark). I have cloned the repository with git clone https://github.com/apache/pig -b spark . 1. On which versions of Pig and Spark is Spork being built? 2. I followed the steps mentioned in https://issues.apache.org/jira/browse/PIG-4059 and try to run

Re: Query | Join Internals

2015-07-31 Thread Divya Gehlot
Hi Gagan, This link may help you https://bluewatersql.wordpress.com/2013/10/04/3-little-piggys-advanced-pig-join-scenarios/ On 30 July 2015 at 22:04, Alan Gates alanfga...@gmail.com wrote: Here's the original design doc: https://wiki.apache.org/pig/PigSkewedJoinSpec Alan. Gagan Juneja

RE: Loading data from a CSV file which has '\n' character in a field

2015-07-23 Thread Divya Gehlot
are not the intended recipient (or authorized to receive for the recipient), please contact the sender by reply email and delete all copies of this message. For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/index.html -Original Message- From: Divya Gehlot

Re: What is tuple in pig ??

2015-07-23 Thread Divya Gehlot
A Pig relation is a bag of tuples. A Pig relation is similar to a table in a relational database, where the tuples in the bag correspond to the rows in a table. Unlike a relational table, however, Pig relations don't require that every tuple contain the same number of fields or that the fields in
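
As a rough illustration of the point above (an analogy only, not Pig code), a relation can be pictured in Python as a list standing in for the bag, holding tuples of differing lengths. The names and ages are borrowed from the sample data in another thread of this listing; the third field on the last tuple is a made-up value. A real Pig bag is unordered, which a Python list does not capture.

```python
# Analogy only: a Pig relation pictured as a Python list (the "bag") of tuples.
# A real Pig bag is unordered; a list is used here purely for illustration.
relation = [
    ("alice", 35),
    ("bob",),              # fewer fields than the other tuples
    ("peter", 24, "sg"),   # extra, hypothetical third field
]


def field_counts(bag):
    """Return the number of fields in each tuple of the bag."""
    return [len(t) for t in bag]


print(field_counts(relation))
```

A relational table would reject rows of differing widths like these, whereas the bag accepts them.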

Re: Loading data from a CSV file which has '\n' character in a field

2015-07-23 Thread Divya Gehlot
you can try this http://pig.apache.org/docs/r0.7.0/udf.html#Load%2FStore+Functions On 23 July 2015 at 09:24, Sunilmanohar Kancharlapalli -X (sunkanch - ZENSAR TECHNOLOGIES INC at Cisco) sunka...@cisco.com wrote: I am trying to load a csv file which has '\n' character in the field and Pig is

run pig script through eclipse without hadoop

2015-07-21 Thread Divya Gehlot
Hi, Sorry for such a basic question, but I am struggling to sort it out. I am trying to run a Pig script through Eclipse without Hadoop on Windows. I am using PigServer to run the Pig script, but I am facing build path issues. I added pig.jar

error while storing in JsonStorage

2015-07-15 Thread Divya Gehlot
I am running my Pig script in local mode in Eclipse. When I try to store the output using JsonStorage, I get: Exception in thread "main" java.lang.RuntimeException: Cannot instantiate:org.apache.pig.builtin.JsonStorage at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:473)

Re: error when using javascript udf

2015-07-13 Thread Divya Gehlot
it ? Additionally, you have just mentioned the LOAD statement in your pig script. Can you show how you registered your javascript UDF and also how you used your js UDF ? Regards, Debabrata On Tue, Jul 7, 2015 at 2:26 PM, Divya Gehlot divya.htco...@gmail.com wrote: Hi, I have written a javascript UDF

junk data in pig output file and debug Pig in local mode

2015-07-13 Thread Divya Gehlot
I am new to Apache Pig and I am trying to debug my Java UDF using PigServer API. Data file format component NIL 2015-07-12 18:58:55.74 E x.xxx..xxx 17 0xd3biz MESSAGE 00.00

Re: Setup debug mode in eclipse for Java UDF and pig script

2015-07-11 Thread Divya Gehlot
{ PigServer pig = new PigServer(ExecType.LOCAL); pig.registerScript("myscript.pig"); } Best Regard, Jeff Zhang On 7/9/15, 7:50 PM, Divya Gehlot divya.htco...@gmail.com wrote: Hi, I am new to pig . Can somebody help me setting the debug mode in eclipse in easy

Java UDF Error: ERROR 1066: Unable to open iterator for alias

2015-07-10 Thread Divya Gehlot
Hi My input data format is (message,NIL,2015-07-01,22:58:53.66,E,machine.com.name,12,0xd6,String,String ,0,0.0,key=valuekey=123456789key=valuekey=USkey=COMPANYkey=MESSAGEkey=123456789key=Stringkey=StringKey=StringKey=String) I have written Java UDF as below to parse last string of input data

Setup debug mode in eclipse for Java UDF and pig script

2015-07-09 Thread Divya Gehlot
Hi, I am new to Pig. Can somebody help me set up the debug mode in Eclipse in easy steps? I would really appreciate the help. Thanks

error when using javascript udf

2015-07-07 Thread Divya Gehlot
Hi, I have written a JavaScript UDF, and as mentioned in the Pig guide http://pig.apache.org/docs/r0.9.2/udf.html#udf-java , I have declared the output schema in my js file: parseString.outputSchema = dataMap:chararray; The same datatype is available in my Pig script too. A = LOAD '$input_file' AS(

does pig Javascript engine support JSON formatting

2015-07-07 Thread Divya Gehlot
Hi, I have written a JavaScript UDF where I am converting the Pig parameter to JSON, and I am getting an error indicating that it does not recognize JSON. Thanks,

how to write custom log loader and store in JSON format

2015-07-03 Thread Divya Gehlot
Hi, I am new to pig and I have a log file in below format (Message,NIL,2015-07-01,22:58:53.66,E,xx.xxx.x.xxx,12,0xd6,BIZ,Componentname,0,0.0,key_1=valueKEY_2=KEY_3=VALUEKEY_4=AUKEY_5=COMPANYKEY_6=VALUEKEY_7=1222KEY_8=VALUEKEY_9=VALUEKEY_10=VALUEKEY_10=VALUE) for which I need

parse the log file key value pairs using Pig

2015-07-03 Thread Divya Gehlot
I have log files (format shown below) which need to be parsed and stored in JSON format (Stringname1,NIL,2015-07-01,22:58:53.66,E,stringname2,12,0xd6,BIZ,LevelMessage ,0,0.0,key=Valuekey=Valuekey=Valuekey=Valuekey=Valuekey=Valuekey=Valuekey=Valuekey=Valuekey=Valuekey=Value) I need to parse key