[jira] Created: (PIG-1660) Consider passing result of COUNT/COUNT_STAR to LIMIT

2010-09-30 Thread Viraj Bhat (JIRA)
Reporter: Viraj Bhat Fix For: 0.9.0 In realistic scenarios we need to split a dataset into segments by using LIMIT, and would like to achieve that goal within the same Pig script. Here is a case: {code} A = load '$DATA' using PigStorage(',') as (id, pvs); B = group A by ALL; C

[jira] Created: (PIG-1630) Support param_files to be loaded into HDFS

2010-09-20 Thread Viraj Bhat (JIRA)
: Viraj Bhat I want to place the parameters of a Pig script in a param_file. But instead of this file being in the local file system where I run my java command, I want this to be on HDFS. {code} $ java -cp pig.jar org.apache.pig.Main -param_file hdfs://namenode/paramfile myscript.pig {code

[jira] Created: (PIG-1631) Support to 2 level nested foreach

2010-09-20 Thread Viraj Bhat (JIRA)
Support to 2 level nested foreach - Key: PIG-1631 URL: https://issues.apache.org/jira/browse/PIG-1631 Project: Pig Issue Type: New Feature Affects Versions: 0.7.0 Reporter: Viraj Bhat What I

[jira] Created: (PIG-1633) Using an alias withing Nested Foreach causes indeterminate behaviour

2010-09-20 Thread Viraj Bhat (JIRA)
Affects Versions: 0.7.0, 0.6.0, 0.5.0, 0.4.0 Reporter: Viraj Bhat I have created a RANDOMINT function which generates random numbers between 0 and a specified value. For example, RANDOMINT(4) gives random numbers between 0 and 3 (inclusive). {code} $hadoop fs -cat rand.dat f g h i j k l

[jira] Created: (PIG-1634) Multiple names for the group field

2010-09-20 Thread Viraj Bhat (JIRA)
, 0.1.0 Reporter: Viraj Bhat I am hoping that in Pig if I type {quote} c = cogroup a by foo, b by bar, the fields c.group, c.foo and c.bar should all map to c.$0 {quote} This would improve the readability of the Pig script. Here's a real usecase: {code} --- pages = LOAD 'pages.dat

[jira] Created: (PIG-1615) Return code from Pig is 0 even if the job fails when using -M flag

2010-09-16 Thread Viraj Bhat (JIRA)
Versions: 0.7.0, 0.6.0 Reporter: Viraj Bhat Fix For: 0.8.0 I have a Pig script of this form, which I used inside a workflow system such as Oozie. {code} A = load '$INPUT' using PigStorage(); store A into '$OUTPUT'; {code} I run this as with Multi-query optimization

[jira] Commented: (PIG-1615) Return code from Pig is 0 even if the job fails when using -M flag

2010-09-16 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910414#action_12910414 ] Viraj Bhat commented on PIG-1615: I tested this on Pig 0.8, but with a downloaded version

[jira] Updated: (PIG-282) Custom Partitioner

2010-09-15 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-282: Release Note: This feature allows specifying a Hadoop Partitioner for the following operations: GROUP/COGROUP

[jira] Created: (PIG-1586) Parameter subsitution using -param option runs into problems when substituing entire pig statements in a shell script (maybe this is a bash problem)

2010-08-31 Thread Viraj Bhat (JIRA)
) Key: PIG-1586 URL: https://issues.apache.org/jira/browse/PIG-1586 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Viraj Bhat I have a Pig script as a template: {code} register Countwords.jar; A = $INPUT; B

[jira] Updated: (PIG-1586) Parameter subsitution using -param option runs into problems when substituing entire pig statements in a shell script (maybe this is a bash problem)

2010-08-31 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1586: Description: I have a Pig script as a template: {code} register Countwords.jar; A = $INPUT; B = FOREACH

[jira] Created: (PIG-1576) Difference in Semantics between Load statement in Pig and HDFS client on Command line

2010-08-27 Thread Viraj Bhat (JIRA)
: Pig Issue Type: Bug Components: impl Affects Versions: 0.7.0, 0.6.0 Reporter: Viraj Bhat Here is my directory structure on HDFS which I want to access using Pig. This is a sample, but in real use case I have more than 100 of these directories. {code} $ hadoop fs

[jira] Created: (PIG-1561) XMLLoader in Piggybank does not support bz2 or gzip compressed XML files

2010-08-23 Thread Viraj Bhat (JIRA)
Components: impl Affects Versions: 0.7.0 Reporter: Viraj Bhat I have a simple Pig script which uses the XMLLoader after the Piggybank is built. {code} register piggybank.jar; A = load '/user/viraj/capacity-scheduler.xml.gz' using

[jira] Created: (PIG-1547) Piggybank MultiStorage does not scale when processing around 7k records per bucket

2010-08-17 Thread Viraj Bhat (JIRA)
Issue Type: Bug Affects Versions: 0.7.0 Reporter: Viraj Bhat I am trying to use the MultiStorage piggybank UDF {code} register pig-svn/trunk/contrib/piggybank/java/piggybank.jar; A = load '/user/viraj/largebucketinput.txt' using PigStorage('\u0001') as (a,b,c); STORE

[jira] Commented: (PIG-1537) Column pruner causes wrong results when using both Custom Store UDF and PigStorage

2010-08-05 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895858#action_12895858 ] Viraj Bhat commented on PIG-1537: Hi Olga, I have given the specific script with UDF's

[jira] Updated: (PIG-1537) Column pruner causes wrong results when using both Custom Store UDF and PigStorage

2010-08-04 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1537: Description: I have a script which is of this pattern and it uses 2 StoreFuncs: {code} register loader.jar

[jira] Created: (PIG-1537) Column pruner causes wrong results when using both Custom Store UDF and PigStorage

2010-08-04 Thread Viraj Bhat (JIRA)
Issue Type: Bug Affects Versions: 0.7.0 Reporter: Viraj Bhat I have a script which is of this pattern and it uses 2 StoreFuncs: {code} register loader.jar register piggy-bank/java/build/storage.jar; %DEFAULT OUTPUTDIR /user/viraj/prunecol/ ss_sc_0 = LOAD '/data/click

[jira] Commented: (PIG-1345) Link casting errors in POCast to actual lines numbers in Pig script

2010-05-06 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864963#action_12864963 ] Viraj Bhat commented on PIG-1345: Richard thanks for suggesting a workaround. The error

[jira] Commented: (PIG-798) Schema errors when using PigStorage and none when using BinStorage in FOREACH??

2010-04-26 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861097#action_12861097 ] Viraj Bhat commented on PIG-798: Hi Ashutosh, Yes that is possible, I know that we can do

[jira] Commented: (PIG-1211) Pig script runs half way after which it reports syntax error

2010-04-26 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861106#action_12861106 ] Viraj Bhat commented on PIG-1211: Ashutosh, yes as more and more people adopt Pig

[jira] Commented: (PIG-798) Schema errors when using PigStorage and none when using BinStorage in FOREACH??

2010-04-26 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861134#action_12861134 ] Viraj Bhat commented on PIG-798: Ashutosh thanks for clarifying, we will wait till that bug

[jira] Commented: (PIG-1345) Link casting errors in POCast to actual lines numbers in Pig script

2010-04-23 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860397#action_12860397 ] Viraj Bhat commented on PIG-1345: Which release will PIG:908 be fixed? Does it guarantee

[jira] Commented: (PIG-1211) Pig script runs half way after which it reports syntax error

2010-04-23 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860419#action_12860419 ] Viraj Bhat commented on PIG-1211: Ashutosh, I feel that the user may not be interested

[jira] Commented: (PIG-1339) International characters in column names not supported

2010-04-23 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860445#action_12860445 ] Viraj Bhat commented on PIG-1339: Hi Ashutosh this does not work in trunk. I am using

[jira] Commented: (PIG-798) Schema errors when using PigStorage and none when using BinStorage in FOREACH??

2010-04-23 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860452#action_12860452 ] Viraj Bhat commented on PIG-798: Hi Ashutosh, The problem here is not about using the data

[jira] Updated: (PIG-798) Schema errors when using PigStorage and none when using BinStorage in FOREACH??

2010-04-23 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-798: Affects Version/s: 0.6.0 0.5.0 0.4.0 0.3.0

[jira] Commented: (PIG-1378) har url not usable in Pig scripts

2010-04-21 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859384#action_12859384 ] Viraj Bhat commented on PIG-1378: har:// currently works in Pig 0.7 when the hdfs location

[jira] Created: (PIG-1378) har url not usable in Pig scripts

2010-04-14 Thread Viraj Bhat (JIRA)
: Viraj Bhat Fix For: 0.7.0 I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS shell {noformat} $hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data' Found 1 items -rw--- 5 viraj users1537234 2010-04-14 09:49 user/viraj

[jira] Updated: (PIG-1378) har url not usable in Pig scripts

2010-04-14 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1378: Description: I am trying to use har (Hadoop Archives) in my Pig script. I can use them through the HDFS

[jira] Commented: (PIG-518) LOBinCond exception in LogicalPlanValidationExecutor when providing default values for bag

2010-04-14 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857157#action_12857157 ] Viraj Bhat commented on PIG-518: The above script generates the following error in Pig 0.7

[jira] Resolved: (PIG-518) LOBinCond exception in LogicalPlanValidationExecutor when providing default values for bag

2010-04-14 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat resolved PIG-518. Fix Version/s: 0.7.0 Resolution: Fixed LOBinCond exception in LogicalPlanValidationExecutor when

[jira] Resolved: (PIG-829) DECLARE statement stop processing after special characters such as dot . , + % etc..

2010-04-14 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat resolved PIG-829. Fix Version/s: 0.7.0 Resolution: Fixed Pig 0.7 yields the correct result. {code} x = LOAD 'something

[jira] Created: (PIG-1377) Pig/Zebra fails without proper error message when the mapred.jobtracker.maxtasks.per.job exceeds threshold

2010-04-13 Thread Viraj Bhat (JIRA)
/jira/browse/PIG-1377 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.6.0, 0.7.0 Reporter: Viraj Bhat I have a Zebra script which generates a huge number of mappers, around 400K. The mapred.jobtracker.maxtasks.per.job is currently set

[jira] Created: (PIG-1374) Order by fails with java.lang.String cannot be cast to org.apache.pig.data.DataBag

2010-04-12 Thread Viraj Bhat (JIRA)
Issue Type: Bug Components: impl Affects Versions: 0.6.0, 0.7.0 Reporter: Viraj Bhat Script loads data from BinStorage(), then flattens columns and then sorts on the second column with order descending. The order by fails with the ClassCastException {code

[jira] Commented: (PIG-756) UDFs should have API for transparently opening and reading files from HDFS or from local file system with only relative path

2010-04-07 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854762#action_12854762 ] Viraj Bhat commented on PIG-756: In Pig 0.7 we have moved local mode of Pig to local mode

[jira] Resolved: (PIG-756) UDFs should have API for transparently opening and reading files from HDFS or from local file system with only relative path

2010-04-07 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat resolved PIG-756. Resolution: Fixed Fix Version/s: 0.7.0 https://issues.apache.org/jira/browse/PIG-1053 fixes this issue

[jira] Created: (PIG-1345) Link casting errors in POCast to actual lines numbers in Pig script

2010-03-31 Thread Viraj Bhat (JIRA)
Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat For the purpose of easy debugging, it would be nice to find out where in the Pig script my warnings are coming from. The only known process is to comment out lines in the Pig script and see

[jira] Created: (PIG-1339) International characters in column names not supported

2010-03-30 Thread Viraj Bhat (JIRA)
Affects Versions: 0.6.0 Reporter: Viraj Bhat There is a particular use-case in which someone specifies a column name to be in International characters. {code} inputdata = load '/user/viraj/inputdata.txt' using PigStorage() as (あいうえお); describe inputdata; dump inputdata; {code

[jira] Created: (PIG-1341) Cannot convert DataByeArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED 20

2010-03-30 Thread Viraj Bhat (JIRA)
Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Viraj Bhat Script reads in BinStorage data and tries to convert a column which is in DataByteArray to Chararray. {code} raw = load 'sampledata' using BinStorage() as (col1,col2, col3); --filter

[jira] Updated: (PIG-1341) Cannot convert DataByeArray to Chararray and results in FIELD_DISCARDED_TYPE_CONVERSION_FAILED

2010-03-30 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1341: Component/s: impl Summary: Cannot convert DataByeArray to Chararray and results

[jira] Created: (PIG-1343) pig_log file missing even though Main tells it is creating one and an M/R job fails

2010-03-30 Thread Viraj Bhat (JIRA)
Issue Type: Bug Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat There is a particular case where I was running with the latest trunk of Pig. {code} $java -cp pig.jar:/home/path/hadoop20cluster org.apache.pig.Main testcase.pig [main] INFO

[jira] Created: (PIG-1308) Inifinite loop in JobClient when reading from BinStorage Message: [org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2]

2010-03-18 Thread Viraj Bhat (JIRA)
] Key: PIG-1308 URL: https://issues.apache.org/jira/browse/PIG-1308 Project: Pig Issue Type: Bug Reporter: Viraj Bhat Fix For: 0.7.0 Simple script fails to read files from BinStorage() and fails

[jira] Updated: (PIG-1308) Inifinite loop in JobClient when reading from BinStorage Message: [org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 2]

2010-03-18 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1308: Description: Simple script fails to read files from BinStorage() and fails to submit jobs to JobTracker

[jira] Created: (PIG-1278) Type mismatch in key from map: expected org.apache.pig.impl.io.NullableFloatWritable, recieved org.apache.pig.impl.io.NullableText

2010-03-05 Thread Viraj Bhat (JIRA)
URL: https://issues.apache.org/jira/browse/PIG-1278 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.7.0 I have a script which uses Map data, and runs a UDF, which creates random numbers

[jira] Created: (PIG-1281) Detect org.apache.pig.data.DataByteArray cannot be cast to org.apache.pig.data.Tuple type of errors at Compile Type during creation of logical plan

2010-03-05 Thread Viraj Bhat (JIRA)
Key: PIG-1281 URL: https://issues.apache.org/jira/browse/PIG-1281 Project: Pig Issue Type: Improvement Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.8.0 This is more of an enhancement request, where we

[jira] Commented: (PIG-1252) Diamond splitter does not generate correct results when using Multi-query optimization

2010-03-02 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840339#action_12840339 ] Viraj Bhat commented on PIG-1252: A modified version of the script works, does this have

[jira] Created: (PIG-1272) Column pruner causes wrong results

2010-03-02 Thread Viraj Bhat (JIRA)
: Viraj Bhat Fix For: 0.7.0 For a simple script the column pruner optimization removes certain columns from the original relation, which results in wrong results. Input file kv contains the following columns (tab separated) {code} a 1 a 2 a 3 b 4 c 5 c

[jira] Commented: (PIG-1272) Column pruner causes wrong results

2010-03-02 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840389#action_12840389 ] Viraj Bhat commented on PIG-1272: Now with Pig 0.7 or trunk we have the following error

[jira] Created: (PIG-1263) Script producing varying number of records when COGROUPing value of map data type with and without types

2010-02-25 Thread Viraj Bhat (JIRA)
/browse/PIG-1263 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.6.0 I have a Pig script which I am experimenting upon. [[Albeit this is not optimized and can be done in variety

[jira] Created: (PIG-1252) Diamond splitter does not generate correct results when using Multi-query optimization

2010-02-22 Thread Viraj Bhat (JIRA)
: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.7.0 I have a script which uses split but somehow does not use one of the split branches. The skeleton of the script is as follows {code} loadData = load '/user/viraj/zebradata' using

[jira] Updated: (PIG-1252) Diamond splitter does not generate correct results when using Multi-query optimization

2010-02-22 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1252: Description: I have a script which uses split but somehow does not use one of the split branches. The skeleton

[jira] Created: (PIG-1247) Error Number makes it hard to debug: ERROR 2999: Unexpected internal error. org.apache.pig.backend.datastorage.DataStorageException cannot be cast to java.lang.Error

2010-02-19 Thread Viraj Bhat (JIRA)
Key: PIG-1247 URL: https://issues.apache.org/jira/browse/PIG-1247 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix

[jira] Created: (PIG-1243) Passing Complex map types to and from streaming causes a problem

2010-02-18 Thread Viraj Bhat (JIRA)
Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.7.0 I have a program which generates different types of Maps fields and stores it into PigStorage. {code} A = load '/user/viraj/three.txt' using PigStorage(); B = foreach A generate ['a'#'12'] as b:map[], ['b'#['c'#'12']] as c

[jira] Reopened: (PIG-1194) ERROR 2055: Received Error while processing the map plan

2010-02-10 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat reopened PIG-1194: Hi Richard, I ran the script attached on the ticket and found out that the map tasks fails

[jira] Commented: (PIG-1131) Pig simple join does not work when it contains empty lines

2010-02-08 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831248#action_12831248 ] Viraj Bhat commented on PIG-1131: Olga I marked it as critical since we mention that Pig can

[jira] Commented: (PIG-1131) Pig simple join does not work when it contains empty lines

2010-02-08 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831251#action_12831251 ] Viraj Bhat commented on PIG-1131: Ashutosh I was able to recreate a similar problem using

[jira] Created: (PIG-1220) Document unknown keywords as missing or to do in future

2010-02-03 Thread Viraj Bhat (JIRA)
: documentation Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.7.0 To get help at the grunt shell I do the following: grunt> touchz 2010-02-04 00:59:28,714 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered IDENTIFIER touchz

[jira] Updated: (PIG-1174) Creation of output path should be done by storage function

2010-01-27 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1174: Fix Version/s: 0.7.0 Creation of output path should be done by storage function

[jira] Updated: (PIG-940) Cross site HDFS access using the default.fs.name not possible in Pig

2010-01-27 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-940: Affects Version/s: (was: 0.3.0) 0.5.0 Fix Version/s: 0.7.0 Cross site HDFS

[jira] Updated: (PIG-531) Way for explain to show 1 plan at a time

2010-01-27 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-531: Fix Version/s: 0.5.0 Hi Olga, I think we have a way to handle it in multi-query optimization

[jira] Created: (PIG-1194) ERROR 2055: Received Error while processing the map plan

2010-01-15 Thread Viraj Bhat (JIRA)
Affects Versions: 0.5.0, 0.6.0 Reporter: Viraj Bhat Assignee: Richard Ding Fix For: 0.6.0 Attachments: inputdata.txt I have a simple Pig script which takes 3 columns out of which one is null. {code} input = load 'inputdata.txt' using PigStorage

[jira] Updated: (PIG-1194) ERROR 2055: Received Error while processing the map plan

2010-01-15 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1194: Attachment: inputdata.txt Testdata to run with this script ERROR 2055: Received Error while processing

[jira] Commented: (PIG-1187) UTF-8 (international code) breaks with loader when load with schema is specified

2010-01-14 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800315#action_12800315 ] Viraj Bhat commented on PIG-1187: Hi Jeff, This is specific to the data we are using

[jira] Created: (PIG-1187) UTF-8 (international code) breaks with loader when load with schema is specified

2010-01-13 Thread Viraj Bhat (JIRA)
Issue Type: Bug Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.6.0 I have a set of Pig statements which dump an international dataset. {code} INPUT_OBJECT = load 'internationalcode'; describe INPUT_OBJECT; dump INPUT_OBJECT; {code} Sample output (756a6196

CFP for 24th International Conference on Supercomputing (ICS 2010, Tsukuba, Japan)

2010-01-05 Thread Viraj Bhat
Dear Hadoop and Pig Users, This is just to let you know that the submission deadline for ICS'10 ( http://www.ics-conference.org/ ) is two weeks from today. ICS is a premier forum for research in cloud/distributed computing and for most of the work/research we do in CCDI. The CFP of the

[jira] Created: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-16 Thread Viraj Bhat (JIRA)
Issue Type: Bug Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.6.0 Hi all, I have a script which does 2 replicated joins in succession. Please note that the inputs do not exist on the HDFS. {code} A = LOAD '/tmp/abc' USING

[jira] Updated: (PIG-1157) Sucessive replicated joins do not generate Map Reduce plan and fails due to OOM

2009-12-16 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1157: Attachment: oomreplicatedjoin.pig replicatedjoinexplain.log Explain output and Pig script

[jira] Created: (PIG-1144) set default_parallelism construct does not set the number of reducers correctly

2009-12-09 Thread Viraj Bhat (JIRA)
Issue Type: Bug Components: impl Affects Versions: 0.7.0 Environment: Hadoop 20 cluster with multi-node installation Reporter: Viraj Bhat Fix For: 0.7.0 Hi all, I have a Pig script where I set the parallelism using the following set construct: set

[jira] Updated: (PIG-1144) set default_parallelism construct does not set the number of reducers correctly

2009-12-09 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1144: Attachment: brokenparallel.out genericscript_broken_parallel.pig Script and explain output

[jira] Commented: (PIG-1144) set default_parallelism construct does not set the number of reducers correctly

2009-12-09 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788436#action_12788436 ] Viraj Bhat commented on PIG-1144: This happens on the real cluster, where the sorting job

[jira] Commented: (PIG-1144) set default_parallelism construct does not set the number of reducers correctly

2009-12-09 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788439#action_12788439 ] Viraj Bhat commented on PIG-1144: Hi Daniel, One more thing to note is that the Last Sort M

[jira] Commented: (PIG-1144) set default_parallelism construct does not set the number of reducers correctly

2009-12-09 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12788481#action_12788481 ] Viraj Bhat commented on PIG-1144: Hi Daniel, Thanks again for your input. This is more

[jira] Created: (PIG-1131) Pig simple join does not work when it contains empty lines

2009-12-07 Thread Viraj Bhat (JIRA)
Affects Versions: 0.7.0 Reporter: Viraj Bhat Priority: Critical Fix For: 0.7.0 I have a simple script, which does a JOIN. {code} input1 = load '/user/viraj/junk1.txt' using PigStorage(' '); describe input1; input2 = load '/user/viraj/junk2.txt' using

[jira] Updated: (PIG-1131) Pig simple join does not work when it contains empty lines

2009-12-07 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1131: Attachment: simplejoinscript.pig junk2.txt junk1.txt Dummy datasets and pig

[jira] Created: (PIG-1124) Unable to set Custom Job Name using the -Dmapred.job.name parameter

2009-12-03 Thread Viraj Bhat (JIRA)
Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Priority: Minor Fix For: 0.6.0 As a Hadoop user I want to control the Job name for my analysis via the command line using the following construct: java -cp pig.jar:$HADOOP_HOME/conf

[jira] Created: (PIG-1101) Pig parser does not recognize its own data type in LIMIT statement

2009-11-20 Thread Viraj Bhat (JIRA)
Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Priority: Minor Fix For: 0.6.0 I have a Pig script in which I specify the number of records to limit as a long type. {code} A = LOAD '/user/viraj/echo.txt' AS (txt:chararray); B = LIMIT A 10L

[jira] Created: (PIG-1081) PigCookBook use of PARALLEL keyword

2009-11-10 Thread Viraj Bhat (JIRA)
Reporter: Viraj Bhat Fix For: 0.5.0 Hi all, I am looking at some tips for optimizing Pig programs (Pig Cookbook) using the PARALLEL keyword. http://hadoop.apache.org/pig/docs/r0.5.0/cookbook.html#Use+PARALLEL+Keyword We know that currently Pig 0.5 uses Hadoop 20 (as its default

[jira] Created: (PIG-1084) Pig CookBook documentation Take Advantage of Join Optimization additions:Merge and Skewed Join

2009-11-10 Thread Viraj Bhat (JIRA)
Project: Pig Issue Type: Bug Components: documentation Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.6.0 Hi all, We have a host of Join optimizations that have been implemented recently in Pig to improve performance

[jira] Commented: (PIG-1060) MultiQuery optimization throws error for multi-level splits

2009-11-04 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12773744#action_12773744 ] Viraj Bhat commented on PIG-1060: Hi Ankur and Richard, I have a script which demonstrates

[jira] Created: (PIG-1064) Behvaiour of COGROUP with and without schema when using * operator

2009-10-29 Thread Viraj Bhat (JIRA)
Components: impl Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.6.0 I have 2 tab separated files, 1.txt and 2.txt $ cat 1.txt 1 2 2 3 $ cat 2.txt 1 2 2 3 I use COGROUP feature of Pig

[jira] Created: (PIG-1031) PigStorage interpreting chararray/bytearray for a tuple element inside a bag as float or double

2009-10-20 Thread Viraj Bhat (JIRA)
Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.5.0 Reporter: Viraj Bhat Fix For: 0.5.0, 0.6.0 I have data stored in a text file as: {(4153E765)} {(AF533765)} I try reading it using PigStorage as: {code} A = load

[jira] Updated: (PIG-1031) PigStorage interpreting chararray/bytearray for a tuple element inside a bag as float or double

2009-10-20 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-1031: Description: I have data stored in a text file as: {(4153E765)} {(AF533765)} I try reading it using

[jira] Created: (PIG-978) ERROR 2100 (hdfs://localhost/tmp/temp175740929/tmp-1126214010 does not exist) and ERROR 2999: (Unexpected internal error. null) when using Multi-Query optimization

2009-09-25 Thread Viraj Bhat (JIRA)
Key: PIG-978 URL: https://issues.apache.org/jira/browse/PIG-978 Project: Pig Issue Type: Bug Affects Versions: 0.6.0 Reporter: Viraj Bhat Fix For: 0.6.0 I have Pig script

[jira] Created: (PIG-974) Issues with mv command when used after store when using -param_file/-param options

2009-09-23 Thread Viraj Bhat (JIRA)
Issue Type: Bug Affects Versions: 0.6.0 Environment: Hadoop 18 and 20 Reporter: Viraj Bhat Fix For: 0.6.0 Attachments: studenttab10k I have a Pig script which moves the final output to another HDFS directory to signal completion, so that another

[jira] Updated: (PIG-974) Issues with mv command when used after store when using -param_file/-param options

2009-09-23 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-974: --- Attachment: studenttab10k Testdata Issues with mv command when used after store when using -param_file/-param

[jira] Commented: (PIG-974) Issues with mv command when used after store when using -param_file/-param options

2009-09-23 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12758962#action_12758962 ] Viraj Bhat commented on PIG-974: It turns out that the problem was due to single quotes. {code

[jira] Created: (PIG-940) Cross site HDFS access using the default.fs.name not possible in Pig

2009-08-31 Thread Viraj Bhat (JIRA)
Components: impl Affects Versions: 0.3.0 Environment: Hadoop 20 Reporter: Viraj Bhat Fix For: 0.3.0 I have a script which does the following: access data from a remote HDFS location (via an HDFS installed at hdfs://remotemachine1.company.com/) [[as I do

[jira] Commented: (PIG-940) Cross site HDFS access using the default.fs.name not possible in Pig

2009-08-31 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12749722#action_12749722 ] Viraj Bhat commented on PIG-940: One important point to add: {code} localmachine.company.com

[jira] Created: (PIG-919) Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText when doing simple group

2009-08-12 Thread Viraj Bhat (JIRA)
Key: PIG-919 URL: https://issues.apache.org/jira/browse/PIG-919 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.3.0 Reporter: Viraj Bhat Fix For: 0.3.0 I have a Pig

[jira] Commented: (PIG-919) Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText when doing simple group

2009-08-12 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12742668#action_12742668 ] Viraj Bhat commented on PIG-919: This problem can be solved simply by casting the firstname

[jira] Commented: (PIG-913) Error in Pig script when grouping on chararray column

2009-08-06 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12740360#action_12740360 ] Viraj Bhat commented on PIG-913: The following works though.. {code} data = LOAD '/user/viraj

[jira] Created: (PIG-828) Problem accessing a tuple within a bag

2009-06-01 Thread Viraj Bhat (JIRA)
Reporter: Viraj Bhat Fix For: 0.3.0 Below pig script creates a tuple which contains 3 columns, 2 of which are chararray's and the third column is a bag of constant chararray. The script later projects the tuple within a bag. {code} a = load 'studenttab5' as (name, age, gpa); b

[jira] Updated: (PIG-828) Problem accessing a tuple within a bag

2009-06-01 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-828: --- Attachment: tupleacc.pig studenttab5 Input script and data. Problem accessing a tuple within

[jira] Created: (PIG-816) PigStorage() does not accept Unicode characters in its contructor

2009-05-22 Thread Viraj Bhat (JIRA)
Components: impl Affects Versions: 0.3.0 Reporter: Viraj Bhat Priority: Critical Fix For: 0.3.0 Simple Pig script which uses Unicode characters in the PigStorage() constructor fails with the following error: {code} studenttab = LOAD '/user/viraj

[jira] Updated: (PIG-816) PigStorage() does not accept Unicode characters in its contructor

2009-05-22 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-816: --- Attachment: pig_1243043613713.log Log file for detailed error message PigStorage() does not accept Unicode

[jira] Commented: (PIG-656) Use of eval word in the package hierarchy of a UDF causes parse exception

2009-05-19 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12710862#action_12710862 ] Viraj Bhat commented on PIG-656: Another pig parse issue when a udf was defined within

[jira] Reopened: (PIG-656) Use of eval word in the package hierarchy of a UDF causes parse exception

2009-05-19 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat reopened PIG-656: Documentation should be updated on the eval keyword and what it actually does otherwise the user can be lost

[jira] Updated: (PIG-656) Use of eval or any other keyword in the package hierarchy of a UDF causes parse exception

2009-05-19 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-656: --- Summary: Use of eval or any other keyword in the package hierarchy of a UDF causes parse exception (was: Use

[jira] Created: (PIG-812) COUNT(*) does not work

2009-05-19 Thread Viraj Bhat (JIRA)
COUNT(*) does not work --- Key: PIG-812 URL: https://issues.apache.org/jira/browse/PIG-812 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.2.0 Reporter: Viraj Bhat

[jira] Updated: (PIG-812) COUNT(*) does not work

2009-05-19 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Bhat updated PIG-812: --- Attachment: studenttab10k Input file COUNT(*) does not work --- Key

[jira] Commented: (PIG-774) Pig does not handle Chinese characters (in both the parameter subsitution using -param_file or embedded in the Pig script) correctly

2009-05-18 Thread Viraj Bhat (JIRA)
[ https://issues.apache.org/jira/browse/PIG-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12710619#action_12710619 ] Viraj Bhat commented on PIG-774: Hi Daniel, For this patch to work, is it important to set
