[jira] [Commented] (PIG-3231) Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input
[ https://issues.apache.org/jira/browse/PIG-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592072#comment-13592072 ]

Tobias Schlottke commented on PIG-3231:
---------------------------------------

We've been using the current trunk of AvroStorage already. I switched on ignoreBadFiles, and now I get a similar exception elsewhere:

{code}
2013-03-04 09:40:03,341 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.pig.backend.executionengine.ExecException: ERROR 2135: Received error from store function. Filesystem closed
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNext(POStore.java:165)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.runPipeline(POSplit.java:254)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.processPlan(POSplit.java:236)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.processPlan(POSplit.java:241)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POSplit.getNext(POSplit.java:228)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
Caused by: java.io.IOException: Filesystem closed
	at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:552)
	at org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1406)
	at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:161)
	at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:136)
	at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:125)
	at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:116)
	at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:90)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
	at java.io.DataOutputStream.write(DataOutputStream.java:107)
	at java.io.DataOutputStream.writeUTF(DataOutputStream.java:401)
	at java.io.DataOutputStream.writeUTF(DataOutputStream.java:323)
	at org.apache.pig.data.utils.SedesHelper.writeChararray(SedesHelper.java:66)
	at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:543)
	at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
	at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135)
	at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:613)
	at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:443)
	at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
	at org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
	at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
{code}

Any ideas?
Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input
--------------------------------------------------------------------------------
Key: PIG-3231
URL: https://issues.apache.org/jira/browse/PIG-3231
Project: Pig
Issue Type: Bug
Affects Versions: 0.11
Environment: CDH4.2, yarn, avro
Reporter: Tobias Schlottke

Hi there, we've got a strange issue after switching to a new cluster with cdh4.2 (from cdh3): Pig seems to create temporary avro files for its map reduce jobs, which it either deletes or never creates. Pig fails with the "no error returned by hadoop" message, but in the namenode logs I found something interesting. The actual exception from the nn-log is:

{code}
2013-03-01 12:59:30,858 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 192.168.1.28:37814: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /user/metrigo/event_logger/compact_log/2013/01/14/_temporary/1/_temporary/attempt_1362133122980_0017_m_07_0/part-m-7.avro File does not exist. Holder DFSClient_attempt_1362133122980_0017_m_07_0_1992466008_1 does not have any open files.
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /user/metrigo/event_logger/compact_log/2013/01/14/_temporary/1/_temporary/attempt_1362133122980_0017_m_07_0/part-m-7.avro File does not exist. Holder DFSClient_attempt_1362133122980_0017_m_07_0_1992466008_1 does not have any open files.
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689)
{code}

Please note that we're analyzing a bunch of files (~200 files, using glob matchers), some of them small. We made it work once without the small files.

*Update* I found the following exception deep in the logs that seems to make the job fail:

{code}
2013-03-03 19:51:06,169 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:metrigo (auth:SIMPLE) cause:java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Filesystem closed
2013-03-03 19:51:06,170 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Filesystem closed
	at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:357)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:526)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
	at java.security.AccessController.doPrivileged(Native Method)
{code}
[jira] [Commented] (PIG-3231) Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input
[ https://issues.apache.org/jira/browse/PIG-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592081#comment-13592081 ]

Cheolsoo Park commented on PIG-3231:
------------------------------------

Enabling 'ignoreBadFiles' doesn't help even if you use the trunk version. That option doesn't handle all the possible IOExceptions in AvroStorage. The patch I worked on in PIG-3059 is meant to catch all the possible IOExceptions in any LoadFunc implementations, but that is NOT committed to trunk/CDH4.2.
[jira] [Resolved] (PIG-3148) OutOfMemory exception while spilling stale DefaultDataBag. Extra option to gc() before spilling large bag.
[ https://issues.apache.org/jira/browse/PIG-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy resolved PIG-3148.
-------------------------------------
Resolution: Fixed
Fix Version/s: 0.11.1, 0.12

Committed to 0.11.1 and trunk. Thanks Koji and Dmitriy.

OutOfMemory exception while spilling stale DefaultDataBag. Extra option to gc() before spilling large bag.
----------------------------------------------------------------------------------------------------------
Key: PIG-3148
URL: https://issues.apache.org/jira/browse/PIG-3148
Project: Pig
Issue Type: Improvement
Components: impl
Reporter: Koji Noguchi
Assignee: Koji Noguchi
Fix For: 0.12, 0.11.1
Attachments: pig-3148-v01.patch, pig-3148-v02.patch

Our user reported that one of their jobs in pig 0.10 occasionally failed with 'Error: GC overhead limit exceeded' or 'Error: Java heap space', but rerunning it sometimes finished successfully. For a reducer with a 1 GB heap, a heap dump taken on OOM contained two huge DefaultDataBags of 300-400 MB each. A jstack at the time of OOM always showed that a spill was running.

{noformat}
"Low Memory Detector" daemon prio=10 tid=0xb9c11800 nid=0xa52 runnable [0xb9afc000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:260)
	at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
	at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
	- locked <0xe57c6390> (a java.io.BufferedOutputStream)
	at java.io.DataOutputStream.write(DataOutputStream.java:90)
	- locked <0xe57c60b8> (a java.io.DataOutputStream)
	at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
	at org.apache.pig.data.utils.SedesHelper.writeBytes(SedesHelper.java:46)
	at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:537)
	at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
	at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135)
	at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:613)
	at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:443)
	at org.apache.pig.data.DefaultDataBag.spill(DefaultDataBag.java:106)
	- locked <0xceb16190> (a java.util.ArrayList)
	at org.apache.pig.impl.util.SpillableMemoryManager.handleNotification(SpillableMemoryManager.java:243)
	- locked <0xbeb86318> (a java.util.LinkedList)
	at sun.management.NotificationEmitterSupport.sendNotification(NotificationEmitterSupport.java:138)
	at sun.management.MemoryImpl.createNotification(MemoryImpl.java:171)
	at sun.management.MemoryPoolImpl$PoolSensor.triggerAction(MemoryPoolImpl.java:272)
	at sun.management.Sensor.trigger(Sensor.java:120)
{noformat}

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
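The jstack above shows the mechanism behind the spill: Pig's SpillableMemoryManager reacts to a JMX memory-usage-threshold notification. A minimal, self-contained sketch of that mechanism (not Pig's actual code; the class name and the 70% threshold are illustrative assumptions):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

// Sketch of how a spill manager hooks memory pressure: pick a heap pool
// that supports usage thresholds (normally the tenured/old generation).
public class SpillTriggerSketch {
    public static MemoryPoolMXBean findSpillablePool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
                return pool;
            }
        }
        return null; // no suitable pool on this JVM
    }

    public static void main(String[] args) {
        MemoryPoolMXBean pool = findSpillablePool();
        if (pool == null) {
            System.out.println("no threshold-capable heap pool");
            return;
        }
        long max = pool.getUsage().getMax();
        if (max > 0) {
            // Ask the JVM to notify listeners once usage crosses 70% of max.
            // A spill manager registers a NotificationListener and spills its
            // largest bags in the callback; PIG-3148 adds the option to gc()
            // first so stale (already unreachable) bags are not spilled.
            pool.setUsageThreshold((long) (max * 0.7));
        }
        System.out.println("watching pool: " + pool.getName());
    }
}
```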
[jira] [Commented] (PIG-3231) Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input
[ https://issues.apache.org/jira/browse/PIG-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592105#comment-13592105 ]

Rohini Palaniswamy commented on PIG-3231:
-----------------------------------------

bq. It looks like the Avro file that Pig is trying to load doesn't exist on hdfs (or is corrupted). I have seen a similar issue when the filename is changed between when Pig launches a job on the front-end and when the job runs on the back-end.

You will not get a "Filesystem closed" exception for that; it will be a FileNotFoundException. If the file was renamed while it was being accessed, you will get a lease exception. This exception happens when some code has closed the filesystem while another piece of code holds a reference to the same FileSystem object (because of the FileSystem cache). A quick glance at AvroStorage does not show an fs.close(), though. I suspect an fs.close() was introduced somewhere in the Pig code, or, most likely, the user UDF is doing an fs.close().

Tobias, if you are using a UDF, please check whether you are doing an fs.close() somewhere. If not, you can work around this until it is fixed by setting fs.hdfs.impl.disable.cache to true.
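Rohini's workaround maps to a single Hadoop client setting; assuming it is applied via core-site.xml (it can equally be set per script with {{set fs.hdfs.impl.disable.cache true}} in Pig), a sketch:

{code}
<!-- core-site.xml: make FileSystem.get() return a fresh instance instead of
     a cached one, so a stray fs.close() elsewhere in the same JVM cannot
     close the handle that AvroStorage is still using -->
<property>
  <name>fs.hdfs.impl.disable.cache</name>
  <value>true</value>
</property>
{code}

The trade-off is that every FileSystem.get() call then creates a new client connection, so this is a diagnostic/stopgap setting rather than a permanent fix.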
[jira] [Commented] (PIG-3231) Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input
[ https://issues.apache.org/jira/browse/PIG-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592132#comment-13592132 ]

Tobias Schlottke commented on PIG-3231:
---------------------------------------

Could it be something like filesystem limits as well? We rebooted the whole cluster for the first time since the installation. Now it seems to fail in reducers with this exception:

{code}
2013-03-04 11:57:02,214 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:metrigo (auth:SIMPLE) cause:org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in InMemoryMerger - Thread to merge in-memory shuffled map-outputs
2013-03-04 11:57:02,215 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in InMemoryMerger - Thread to merge in-memory shuffled map-outputs
	at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:121)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:379)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
Caused by: java.lang.ClassCastException: org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$CompressAwarePath cannot be cast to java.lang.Comparable
	at java.util.TreeMap.put(TreeMap.java:559)
	at java.util.TreeSet.add(TreeSet.java:255)
	at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeOnDiskFile(MergeManagerImpl.java:340)
	at org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl$InMemoryMerger.merge(MergeManagerImpl.java:495)
	at org.apache.hadoop.mapreduce.task.reduce.MergeThread.run(MergeThread.java:94)
{code}

Which leads us to this issue: https://issues.apache.org/jira/browse/MAPREDUCE-4965. CDH 4.2.0 seems to introduce this; we're now patching MapReduce and giving it another spin.
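The ClassCastException above is the generic failure mode when elements of a class that does not implement Comparable are inserted into a TreeSet that has no Comparator, which is what MAPREDUCE-4965 fixed for CompressAwarePath. A self-contained sketch of the mechanism (FakePath is a hypothetical stand-in, not the Hadoop class):

```java
import java.util.Comparator;
import java.util.TreeSet;

public class TreeSetCastDemo {
    // Hypothetical stand-in for MergeManagerImpl$CompressAwarePath:
    // a class that does NOT implement Comparable.
    static class FakePath {
        final String name;
        FakePath(String name) { this.name = name; }
    }

    /** A TreeSet built without a Comparator casts elements to Comparable
     *  internally; with a non-Comparable element type, add() fails. */
    public static boolean addWithoutComparatorThrows() {
        TreeSet<FakePath> broken = new TreeSet<>();
        try {
            broken.add(new FakePath("part-0"));
            broken.add(new FakePath("part-1"));
            return false;
        } catch (ClassCastException e) {
            return true; // same failure mode as MAPREDUCE-4965
        }
    }

    /** The fix: supply a Comparator (MAPREDUCE-4965 instead made the class
     *  implement Comparable; either approach removes the internal cast). */
    public static int sizeWithComparator() {
        TreeSet<FakePath> fixed =
            new TreeSet<>(Comparator.comparing((FakePath p) -> p.name));
        fixed.add(new FakePath("part-0"));
        fixed.add(new FakePath("part-1"));
        return fixed.size();
    }

    public static void main(String[] args) {
        System.out.println("throws without Comparator: " + addWithoutComparatorThrows());
        System.out.println("size with Comparator: " + sizeWithComparator());
    }
}
```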
[jira] [Commented] (PIG-3231) Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input
[ https://issues.apache.org/jira/browse/PIG-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592199#comment-13592199 ]

Tobias Schlottke commented on PIG-3231:
---------------------------------------

Patching that worked like a charm. We'll see if the error still persists for any of our workflows.
[jira] [Updated] (PIG-3136) Introduce a syntax making declared aliases optional
[ https://issues.apache.org/jira/browse/PIG-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-3136:
----------------------------------
Attachment: PIG-3136-3.patch

I've updated the RB and this patch with your comments, Cheolsoo. Let me know what you think.

Introduce a syntax making declared aliases optional
---------------------------------------------------
Key: PIG-3136
URL: https://issues.apache.org/jira/browse/PIG-3136
Project: Pig
Issue Type: Improvement
Reporter: Jonathan Coveney
Assignee: Jonathan Coveney
Fix For: 0.12
Attachments: PIG-3136-0.patch, PIG-3136-1.patch, PIG-3136-2.patch, PIG-3136-3.patch

This is something Daniel and I have talked about before, and now that we have the @ syntax, this is easy to implement. The idea is that relation names are no longer required; you can instead use a fat arrow (obviously that can be changed) to signify this. The benefit is not having to carry the mental load of naming everything. One other possibility is just making "alias =" optional. I fear that that could be a little TOO magical, but I welcome opinions.
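Purely as an illustration of the idea (the fat-arrow token is explicitly up for debate in the issue, so none of this syntax should be taken as final), the proposal reads roughly as:

{code}
-- today: every relation needs a declared alias
users  = LOAD 'users' AS (name:chararray, age:int);
adults = FILTER users BY age >= 18;

-- proposed: the fat arrow drops the alias; @ refers to the previous relation
=> LOAD 'users' AS (name:chararray, age:int);
=> FILTER @ BY age >= 18;
{code}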
[jira] [Commented] (PIG-3211) Allow default Load/Store funcs to be configurable
[ https://issues.apache.org/jira/browse/PIG-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592257#comment-13592257 ]

Jonathan Coveney commented on PIG-3211:
---------------------------------------

I'll take a closer look in a little bit, though I will say that there is a PigConfiguration singleton I'd like to see catch on. Ideally, it should be the central place for configurations like this.

Allow default Load/Store funcs to be configurable
-------------------------------------------------
Key: PIG-3211
URL: https://issues.apache.org/jira/browse/PIG-3211
Project: Pig
Issue Type: New Feature
Affects Versions: 0.12
Reporter: Prashant Kommireddi
Assignee: Prashant Kommireddi
Fix For: 0.12
Attachments: PIG-3211.patch

PigStorage is used by default when a Load/StoreFunc is not specified. It would be useful to make this configurable.
[jira] [Created] (PIG-3232) Refactor Pig so that configurations use PigConfiguration wherever possible
Jonathan Coveney created PIG-3232:
---------------------------------
Summary: Refactor Pig so that configurations use PigConfiguration wherever possible
Key: PIG-3232
URL: https://issues.apache.org/jira/browse/PIG-3232
Project: Pig
Issue Type: Improvement
Reporter: Jonathan Coveney
Fix For: 0.12
[jira] [Updated] (PIG-3144) Erroneous map entry alias resolution leading to Duplicate schema alias errors
[ https://issues.apache.org/jira/browse/PIG-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Coveney updated PIG-3144:
----------------------------------
Attachment: PIG-3144-1.patch

Updated. Let me know how the tests come back. Thanks, Cheolsoo!

Erroneous map entry alias resolution leading to Duplicate schema alias errors
-----------------------------------------------------------------------------
Key: PIG-3144
URL: https://issues.apache.org/jira/browse/PIG-3144
Project: Pig
Issue Type: Bug
Affects Versions: 0.11, 0.10.1
Reporter: Kai Londenberg
Assignee: Jonathan Coveney
Fix For: 0.12
Attachments: PIG-3144-0.patch, PIG-3144-1.patch

The following code illustrates a problem concerning alias resolution in Pig. The schema of D2 will incorrectly be described as containing two age fields, and the last step in the following script will lead to a "Duplicate schema alias" error message. I only encountered this bug when using aliases for map fields.

{code}
DATA = LOAD 'file:///whatever' as (a:map[chararray], b:chararray);
D1 = FOREACH DATA GENERATE a#'name' as name, a#'age' as age, b;
D2 = FOREACH D1 GENERATE name, age, b;
DESCRIBE D2;
{code}

Output:

{code}
D2: { age: chararray, age: chararray, b: chararray }
{code}

{code}
D3 = FOREACH D2 GENERATE *;
DESCRIBE D3;
{code}

Output:

{code}
file file:///.../pig-bug-example.pig, line 20, column 16 Duplicate schema alias: age
{code}

This error occurs in this form in Apache Pig version 0.11.0-SNAPSHOT (r6408). A less severe variant of this bug is also present in Pig 0.10.1: there, the "Duplicate schema alias" error message won't occur, but the schema of D2 (see above) will still have the wrong duplicate alias entries.
[jira] [Updated] (PIG-2988) start deploying pigunit maven artifact part of Pig release process
[ https://issues.apache.org/jira/browse/PIG-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick White updated PIG-2988:
----------------------------
Attachment: PIG-2988.0.patch

The attached patch uploads the pigunit and pigsmoke jars when a release is deployed. Currently the pigunit and pigsmoke artifacts are uploaded as snapshots to the apache repo (e.g. https://repository.apache.org/content/repositories/snapshots/org/apache/pig/pigunit/0.12.0-SNAPSHOT), so this is a fairly small change.

start deploying pigunit maven artifact part of Pig release process
------------------------------------------------------------------
Key: PIG-2988
URL: https://issues.apache.org/jira/browse/PIG-2988
Project: Pig
Issue Type: New Feature
Components: build
Affects Versions: 0.11, 0.10.1
Reporter: Johnny Zhang
Attachments: PIG-2988.0.patch

Right now the Pig project doesn't publish a pigunit Maven artifact, so things like

{noformat}
<dependency>
  <groupId>org.apache.pig</groupId>
  <artifactId>pigunit</artifactId>
  <version>0.10.0</version>
</dependency>
{noformat}

don't work. Can we start deploying the pigunit Maven artifact as part of the release process? Thanks.
[jira] [Commented] (PIG-3231) Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input
[ https://issues.apache.org/jira/browse/PIG-3231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592386#comment-13592386 ] Cheolsoo Park commented on PIG-3231: [~rohini], thank you very much for correcting me! You're absolutely right. In fact, I should have said this: The case that I have seen before is that Flume AvroSinks randomly die while writing Avro files, so the files that they were writing to are not properly closed. This leaves corrupted files in a directory. Now Pig launches a job on that directory, and jobs fail during execution since files cannot be opened or read. I found that it is common among my customers to load files onto HDFS with another tool and run Pig jobs on them at the same time. Apparently, this often leads to what you're saying: {quote} This exception happens when some code has closed the filesystem and another piece of code has reference to the same FileSystem object (because of the FileSystem cache). A quick glance at AvroStorage does not have a fs.close() though. {quote} The way I dealt with this was ignoring bad files. In AvroStorage, there are several places (about 6, IIRC) that can throw an IOException, so I caught them in PigRecordReader instead, dropped the input split entirely, and moved on. Obviously, this is not a perfect solution, but in the short term it seems to work so far. Problems with pig (TRUNK, 0.11) after upgrading to CDH4.2(yarn) using avro input Key: PIG-3231 URL: https://issues.apache.org/jira/browse/PIG-3231 Project: Pig Issue Type: Bug Affects Versions: 0.11 Environment: CDH4.2, yarn, avro Reporter: Tobias Schlottke Hi there, we've got a strange issue after switching to a new cluster with cdh4.2 (from cdh3): Pig seems to create temporary avro files for its map reduce jobs, which it either deletes or never creates. Pig fails with a "no error returned by hadoop" message, but in the nn logs I found something interesting.
The actual exception from nn-log is: a {code} 2013-03-01 12:59:30,858 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 192.168.1.28:37814: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /user/metrigo/event_logger/compact_log/2013/01/14/_temporary/1/_temporary/attempt_1362133122980_0017_m_07_0/part-m-7.avro File does not exist. Holder DFSClient_attempt_1362133122980_0017_m_07_0_1992466008_1 does not have any open files. org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /user/metrigo/event_logger/compact_log/2013/01/14/_temporary/1/_temporary/attempt_1362133122980_0017_m_07_0/part-m-7.avro File does not exist. Holder DFSClient_attempt_1362133122980_0017_m_07_0_1992466008_1 does not have any open files. at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2396) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2387) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2183) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:481) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1695) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1691) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:416) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1689) {code} Please note that we're analyzing a bunch of files (~200 files, we're using glob matchers), some of them are small. We made it work once without the small files. *Update* I found the following exception deep in the logs that seems to make the job fail: {code} 2013-03-03 19:51:06,169 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:metrigo (auth:SIMPLE) cause:java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Filesystem
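The "ignore bad files" workaround Cheolsoo describes above — catching the IOException at the record-reader level and discarding the whole input split — can be sketched in plain Java. The `Split` interface and `readAll` helper below are hypothetical stand-ins for Hadoop's InputSplit/RecordReader machinery, not Pig's actual classes:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SkipBadSplits {
    // Hypothetical stand-in for an input split: opening it may fail
    // (e.g. a corrupt Avro file left behind by a dead writer).
    interface Split {
        Iterator<String> open() throws IOException;
    }

    // Read every record from every split, dropping any split whose
    // file cannot be opened or read instead of failing the whole job.
    static List<String> readAll(List<Split> splits) {
        List<String> records = new ArrayList<>();
        for (Split split : splits) {
            try {
                Iterator<String> it = split.open();
                while (it.hasNext()) {
                    records.add(it.next());
                }
            } catch (IOException e) {
                // Bad file: skip this split entirely and move on,
                // mirroring the "ignore bad files" behavior described above.
            }
        }
        return records;
    }

    public static void main(String[] args) {
        Split good = () -> Arrays.asList("rec1", "rec2").iterator();
        Split corrupt = () -> { throw new IOException("Filesystem closed"); };
        // prints [rec1, rec2, rec1, rec2]
        System.out.println(readAll(Arrays.asList(good, corrupt, good)));
    }
}
```

As the comment notes, this trades completeness for robustness: records from a corrupt split are silently lost, which is why it is only a short-term workaround.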
[jira] [Updated] (PIG-2988) start deploying pigunit maven artifact part of Pig release process
[ https://issues.apache.org/jira/browse/PIG-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick White updated PIG-2988: Assignee: Nick White Status: Patch Available (was: Open)
[jira] [Created] (PIG-3233) Deploy a Piggybank Jar
Nick White created PIG-3233: --- Summary: Deploy a Piggybank Jar Key: PIG-3233 URL: https://issues.apache.org/jira/browse/PIG-3233 Project: Pig Issue Type: New Feature Components: piggybank Affects Versions: 0.11, 0.10.0 Reporter: Nick White Assignee: Nick White Fix For: 0.10.1, 0.11.1 Attachments: PIG-3233.0.patch The attached patch adds the piggybank contrib jar to the mvn-install and mvn-deploy ant targets in the same way as the pigunit and pigsmoke artifacts.
[jira] [Updated] (PIG-3233) Deploy a Piggybank Jar
[ https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick White updated PIG-3233: Attachment: PIG-3233.0.patch
[jira] [Updated] (PIG-3233) Deploy a Piggybank Jar
[ https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick White updated PIG-3233: Status: Patch Available (was: Open)
[jira] [Commented] (PIG-3136) Introduce a syntax making declared aliases optional
[ https://issues.apache.org/jira/browse/PIG-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592432#comment-13592432 ] Cheolsoo Park commented on PIG-3136: +1. I noticed that you modified dumpSchema() and dumpSchemaNested() in PigServer. The unit tests fully passed with the last patch, but let me do another run of the unit tests with the new patch. I am almost certain that no test will fail; nevertheless, it's always good to verify. :-) Introduce a syntax making declared aliases optional --- Key: PIG-3136 URL: https://issues.apache.org/jira/browse/PIG-3136 Project: Pig Issue Type: Improvement Reporter: Jonathan Coveney Assignee: Jonathan Coveney Fix For: 0.12 Attachments: PIG-3136-0.patch, PIG-3136-1.patch, PIG-3136-2.patch, PIG-3136-3.patch This is something Daniel and I have talked about before, and now that we have the @ syntax, this is easy to implement. The idea is that relation names are no longer required, and you can instead use a fat arrow (obviously that can be changed) to signify this. The benefit is not having to engage in the mental load of having to name everything. One other possibility is just making alias = optional. I fear that could be a little TOO magical, but I welcome opinions.
[jira] [Commented] (PIG-3211) Allow default Load/Store funcs to be configurable
[ https://issues.apache.org/jira/browse/PIG-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592486#comment-13592486 ] Cheolsoo Park commented on PIG-3211: [~prkommireddi], I have two comments: # I think we can simplify the code. Why don't we do this?
{code:title=buildLoadOp}
-FuncSpec instantiatedFuncSpec =
-    funcSpec == null ?
-        new FuncSpec(PigStorage.class.getName()) :
-        funcSpec;
-loFunc = (LoadFunc)PigContext.instantiateFuncFromSpec(instantiatedFuncSpec);
-String fileNameKey = QueryParserUtils.constructFileNameSignature(filename, instantiatedFuncSpec) + "_" + (loadIndex++);
+String defaultLoadFuncName = pigContext.getProperties().getProperty("pig.default.load.func", PigStorage.class.getName());
+funcSpec = funcSpec == null ? new FuncSpec(defaultLoadFuncName) : funcSpec;
+loFunc = (LoadFunc)PigContext.instantiateFuncFromSpec(funcSpec);
+String fileNameKey = QueryParserUtils.constructFileNameSignature(filename, funcSpec) + "_" + (loadIndex++);
{code}
{code:title=buildStoreOp}
-FuncSpec instantiatedFuncSpec =
-    funcSpec == null ?
-        new FuncSpec(PigStorage.class.getName()) :
-        funcSpec;
-
-StoreFuncInterface stoFunc = (StoreFuncInterface)PigContext.instantiateFuncFromSpec(instantiatedFuncSpec);
+String defaultStoreFuncName = pigContext.getProperties().getProperty("pig.default.store.func", PigStorage.class.getName());
+funcSpec = funcSpec == null ? new FuncSpec(defaultStoreFuncName) : funcSpec;
+StoreFuncInterface stoFunc = (StoreFuncInterface)PigContext.instantiateFuncFromSpec(funcSpec);
{code}
I can confirm that your test cases pass with this, so I don't think we need the helper functions in Utils.java.
# In addition, we should change the following in QueryParserUtils.java:
{code:title=QueryParserUtils.java}
public static void attachStorePlan(String scope, LogicalPlan lp, String fileName, String func,
        Operator input, String alias, PigContext pigContext) throws FrontendException {
    if( func == null ) {
-       func = PigStorage.class.getName();
+       func = pigContext.getProperties().getProperty("pig.default.store.func", PigStorage.class.getName());
    }
{code}
Let me know what you think. Allow default Load/Store funcs to be configurable - Key: PIG-3211 URL: https://issues.apache.org/jira/browse/PIG-3211 Project: Pig Issue Type: New Feature Affects Versions: 0.12 Reporter: Prashant Kommireddi Assignee: Prashant Kommireddi Fix For: 0.12 Attachments: PIG-3211.patch PigStorage is used by default when a Load/StoreFunc is not specified. It would be useful to make this configurable.
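The core of the simplification suggested above is a plain `Properties` lookup with a fallback default. A minimal standalone sketch of that pattern — the property names come from the patch, but the class and helper methods here are illustrative, not Pig's actual code, and `com.example.MyLoader` is a made-up user class:

```java
import java.util.Properties;

public class DefaultFuncs {
    // Fully-qualified name of Pig's built-in default storage function.
    static final String PIG_STORAGE = "org.apache.pig.builtin.PigStorage";

    // Resolve the load func class name: a user-set property wins,
    // otherwise fall back to PigStorage.
    static String defaultLoadFunc(Properties props) {
        return props.getProperty("pig.default.load.func", PIG_STORAGE);
    }

    // Same fallback logic for the store func.
    static String defaultStoreFunc(Properties props) {
        return props.getProperty("pig.default.store.func", PIG_STORAGE);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // prints org.apache.pig.builtin.PigStorage
        System.out.println(defaultLoadFunc(props));
        props.setProperty("pig.default.load.func", "com.example.MyLoader");
        // prints com.example.MyLoader
        System.out.println(defaultLoadFunc(props));
    }
}
```

Because `Properties.getProperty(key, default)` handles the unset case directly, the separate `instantiatedFuncSpec` ternaries in the original code collapse into one line each, which is why the review comment argues no extra helper functions are needed.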
[jira] [Commented] (PIG-3136) Introduce a syntax making declared aliases optional
[ https://issues.apache.org/jira/browse/PIG-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592518#comment-13592518 ] Jonathan Coveney commented on PIG-3136: --- Cheolsoo, can you update this when the tests pass?
[jira] [Commented] (PIG-3136) Introduce a syntax making declared aliases optional
[ https://issues.apache.org/jira/browse/PIG-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592534#comment-13592534 ] Cheolsoo Park commented on PIG-3136: Will do. It's likely that you will find your patch committed tomorrow morning.
[jira] [Updated] (PIG-3144) Erroneous map entry alias resolution leading to Duplicate schema alias errors
[ https://issues.apache.org/jira/browse/PIG-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3144: --- Resolution: Fixed Fix Version/s: 0.11.1 Status: Resolved (was: Patch Available) Committed to trunk and 0.11. Note that I replaced the @'s with relation names in the new test case for 0.11, because @ isn't supported in 0.11.
[jira] [Updated] (PIG-3144) Erroneous map entry alias resolution leading to Duplicate schema alias errors
[ https://issues.apache.org/jira/browse/PIG-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheolsoo Park updated PIG-3144: --- Attachment: PIG-3144-1-branch-0.11.patch Attaching the 0.11 patch for the record.
[jira] [Updated] (PIG-3211) Allow default Load/Store funcs to be configurable
[ https://issues.apache.org/jira/browse/PIG-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Kommireddi updated PIG-3211: - Attachment: PIG-3211_2.patch Thanks [~cheolsoo]. I have updated the patch (it may be simplified a bit further). I also took Jon's suggestion of using PigConfiguration to define the new properties. These have also been documented in pig.properties for users.
[jira] [Commented] (PIG-3211) Allow default Load/Store funcs to be configurable
[ https://issues.apache.org/jira/browse/PIG-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592660#comment-13592660 ] Cheolsoo Park commented on PIG-3211: +1. I will wait a day before committing just to see if Jonathan has more suggestions. Thank you, Prashant!
[jira] [Commented] (PIG-3211) Allow default Load/Store funcs to be configurable
[ https://issues.apache.org/jira/browse/PIG-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592710#comment-13592710 ] Jonathan Coveney commented on PIG-3211: --- looks fine to me. go ahead and commit it cheolsoo :)
[jira] [Commented] (PIG-3233) Deploy a Piggybank Jar
[ https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592712#comment-13592712 ] Bill Graham commented on PIG-3233: -- Thanks for tackling this one, Nick! The mechanics of the patch look good, but where did you get the deps that you included in {{ivy/piggybank-template.xml}}?
[jira] [Commented] (PIG-3211) Allow default Load/Store funcs to be configurable
[ https://issues.apache.org/jira/browse/PIG-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592716#comment-13592716 ] Cheolsoo Park commented on PIG-3211: OK, I will run the unit tests now.
[jira] [Commented] (PIG-3183) rm or rmf commands should respect globbing/regex of path
[ https://issues.apache.org/jira/browse/PIG-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592762#comment-13592762 ] Cheolsoo Park commented on PIG-3183: [~prkommireddi], please correct me if I am wrong. * As of now, these limitations can be easily worked around by using fs commands (i.e. fs -rm * and fs -ls *). * Given that these commands are not documented (and thus not official), I would encourage users to use the fs commands. I do not know why these commands were added in the first place; we should keep them for backward compatibility for a while, but eventually I would like to get rid of them since they're duplicates, IMO. rm or rmf commands should respect globbing/regex of path Key: PIG-3183 URL: https://issues.apache.org/jira/browse/PIG-3183 Project: Pig Issue Type: Improvement Components: grunt Affects Versions: 0.10.0 Reporter: Prashant Kommireddi Assignee: Prashant Kommireddi Fix For: 0.12 Attachments: PIG-3183.patch Hadoop fs commands support globbing when deleting files/dirs. Pig is not consistent with this behavior, and it seems we could change the rm/rmf commands to do the same. For eg:
{code}
localhost:pig pkommireddi$ ls -ld out*
drwxr-xr-x 12 pkommireddi SF\domain users 408 Feb 13 01:09 out
drwxr-xr-x  2 pkommireddi SF\domain users  68 Feb 13 01:16 out1
drwxr-xr-x  2 pkommireddi SF\domain users  68 Feb 13 01:16 out2
localhost:pig pkommireddi$ bin/pig -x local
grunt> rmf out*
grunt> quit
localhost:pig pkommireddi$ ls -ld out*
drwxr-xr-x 12 pkommireddi SF\domain users 408 Feb 13 01:09 out
drwxr-xr-x  2 pkommireddi SF\domain users  68 Feb 13 01:16 out1
drwxr-xr-x  2 pkommireddi SF\domain users  68 Feb 13 01:16 out2
{code}
Ideally, the user would expect rmf out* to delete all of the above dirs.
[jira] [Updated] (PIG-2988) start deploying pigunit maven artifact part of Pig release process
[ https://issues.apache.org/jira/browse/PIG-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohini Palaniswamy updated PIG-2988: Attachment: PIG-2988.0-branch11.patch Attaching the patch for branch-0.11. The patch for trunk did not apply due to whitespace issues.
[jira] [Commented] (PIG-3081) Pig progress stays at 0% for the first job in hadoop 23
[ https://issues.apache.org/jira/browse/PIG-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592783#comment-13592783 ] Cheolsoo Park commented on PIG-3081: +1. LGTM. Pig progress stays at 0% for the first job in hadoop 23 --- Key: PIG-3081 URL: https://issues.apache.org/jira/browse/PIG-3081 Project: Pig Issue Type: Bug Affects Versions: 0.10.0 Reporter: Rohini Palaniswamy Assignee: Rohini Palaniswamy Fix For: 0.12 Attachments: PIG-3081-1.patch, PIG-3081.patch We are seeing that for many scripts if there are multiple jobs in the job graph, progress stays at 0% for the first job and jumps to 33% when the first job completes. There is no intermediate progress. After that, intermediate progress gets reported for the subsequent jobs. Noticed this with jobs that do filtering and order by.
[jira] [Commented] (PIG-3183) rm or rmf commands should respect globbing/regex of path
[ https://issues.apache.org/jira/browse/PIG-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592808#comment-13592808 ] Prashant Kommireddi commented on PIG-3183: -- Right. I receive a lot of questions regarding rm/ls behavior, and I point users to the fs commands. But it's a pain for users to start using rm/ls and only then realize they don't work as expected. I would be in favor of deprecating these, or maybe even translating them to fs commands under the hood.
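The globbing that rm/rmf would need is essentially what Hadoop's FileSystem.globStatus already does for the fs commands; the same idea can be shown self-contained with java.nio's glob PathMatcher. The class below is an illustrative sketch of expanding a pattern against candidate names before deleting, not the actual patch:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GlobExpand {
    // Filter candidate file names down to those matching a glob
    // pattern, as an rm/rmf implementation might before deleting.
    static List<String> expand(String pattern, List<String> names) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        List<String> matched = new ArrayList<>();
        for (String name : names) {
            if (matcher.matches(Paths.get(name))) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // The directories from the bug report: out, out1, out2.
        List<String> dirs = Arrays.asList("out", "out1", "out2", "input");
        // prints [out, out1, out2]
        System.out.println(expand("out*", dirs));
    }
}
```

With this kind of expansion in place, `rmf out*` would resolve to all three directories from the report before any delete call is issued, matching what `fs -rm` already does.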
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (33 issues)
Subscriber: pigdaily

Key      Summary
PIG-3233 Deploy a Piggybank Jar
         https://issues.apache.org/jira/browse/PIG-3233
PIG-3215 [piggybank] Add LTSVLoader to load LTSV (Labeled Tab-separated Values) files
         https://issues.apache.org/jira/browse/PIG-3215
PIG-3211 Allow default Load/Store funcs to be configurable
         https://issues.apache.org/jira/browse/PIG-3211
PIG-3210 Pig fails to start when it cannot write log to log files
         https://issues.apache.org/jira/browse/PIG-3210
PIG-3208 [zebra] TFile should not set io.compression.codec.lzo.buffersize
         https://issues.apache.org/jira/browse/PIG-3208
PIG-3205 Passing arguments to python script does not work with -f option
         https://issues.apache.org/jira/browse/PIG-3205
PIG-3198 Let users use any function from PigType -> PigType as if it were builtin
         https://issues.apache.org/jira/browse/PIG-3198
PIG-3183 rm or rmf commands should respect globbing/regex of path
         https://issues.apache.org/jira/browse/PIG-3183
PIG-3172 Partition filter push down does not happen when there is a non partition key map column filter
         https://issues.apache.org/jira/browse/PIG-3172
PIG-3166 Update eclipse .classpath according to ivy library.properties
         https://issues.apache.org/jira/browse/PIG-3166
PIG-3164 Pig current releases lack a UDF endsWith. This UDF tests if a given string ends with the specified suffix.
         https://issues.apache.org/jira/browse/PIG-3164
PIG-3141 Giving CSVExcelStorage an option to handle header rows
         https://issues.apache.org/jira/browse/PIG-3141
PIG-3136 Introduce a syntax making declared aliases optional
         https://issues.apache.org/jira/browse/PIG-3136
PIG-3123 Simplify Logical Plans By Removing Unneccessary Identity Projections
         https://issues.apache.org/jira/browse/PIG-3123
PIG-3122 Operators should not implicitly become reserved keywords
         https://issues.apache.org/jira/browse/PIG-3122
PIG-3114 Duplicated macro name error when using pigunit
         https://issues.apache.org/jira/browse/PIG-3114
PIG-3105 Fix TestJobSubmission unit test failure.
         https://issues.apache.org/jira/browse/PIG-3105
PIG-3088 Add a builtin udf which removes prefixes
         https://issues.apache.org/jira/browse/PIG-3088
PIG-3081 Pig progress stays at 0% for the first job in hadoop 23
         https://issues.apache.org/jira/browse/PIG-3081
PIG-3069 Native Windows Compatibility for Pig E2E Tests and Harness
         https://issues.apache.org/jira/browse/PIG-3069
PIG-3028 testGrunt dev test needs some command filters to run correctly without cygwin
         https://issues.apache.org/jira/browse/PIG-3028
PIG-3027 pigTest unit test needs a newline filter for comparisons of golden multi-line
         https://issues.apache.org/jira/browse/PIG-3027
PIG-3026 Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
         https://issues.apache.org/jira/browse/PIG-3026
PIG-3024 TestEmptyInputDir unit test - hadoop version detection logic is brittle
         https://issues.apache.org/jira/browse/PIG-3024
PIG-3015 Rewrite of AvroStorage
         https://issues.apache.org/jira/browse/PIG-3015
PIG-3010 Allow UDF's to flatten themselves
         https://issues.apache.org/jira/browse/PIG-3010
PIG-2988 start deploying pigunit maven artifact part of Pig release process
         https://issues.apache.org/jira/browse/PIG-2988
PIG-2959 Add a pig.cmd for Pig to run under Windows
         https://issues.apache.org/jira/browse/PIG-2959
PIG-2955 Fix bunch of Pig e2e tests on Windows
         https://issues.apache.org/jira/browse/PIG-2955
PIG-2643 Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc
         https://issues.apache.org/jira/browse/PIG-2643
PIG-2641 Create toJSON function for all complex types: tuples, bags and maps
         https://issues.apache.org/jira/browse/PIG-2641
PIG-2591 Unit tests should not write to /tmp but respect java.io.tmpdir
         https://issues.apache.org/jira/browse/PIG-2591
PIG-1914 Support load/store JSON data in Pig
         https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-3233) Deploy a Piggybank Jar
[ https://issues.apache.org/jira/browse/PIG-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592893#comment-13592893 ]

Nick White commented on PIG-3233:
---------------------------------

I just used the imports:

{code}
grep -hr '^import' contrib/piggybank/java/src/main | sort | uniq
{code}

for the compile-time dependencies, and:

{code}
grep -hr '^import' contrib/piggybank/java/src/test | sort | uniq
{code}

for anything else in the test scope. I'm not sure there's any better way of keeping the ivy and maven dependencies in sync (especially as the template poms don't use the ivy.properties file to pick up their versions).

Deploy a Piggybank Jar
----------------------
                Key: PIG-3233
                URL: https://issues.apache.org/jira/browse/PIG-3233
            Project: Pig
         Issue Type: New Feature
         Components: piggybank
   Affects Versions: 0.10.0, 0.11
           Reporter: Nick White
           Assignee: Nick White
            Fix For: 0.10.1, 0.11.1
        Attachments: PIG-3233.0.patch

The attached patch adds the piggybank contrib jar to the mvn-install and mvn-deploy ant targets, in the same way as the pigunit and pigsmoke artifacts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
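The grep pipelines quoted in the comment above can be exercised as a small self-contained script. The sample source tree below is fabricated for illustration (the real paths are contrib/piggybank/java/src/main and contrib/piggybank/java/src/test); this is only a sketch of the approach, not part of the patch.

```shell
#!/bin/sh
# Sketch of the import-listing approach from the comment above. A throwaway
# source tree stands in for contrib/piggybank/java/src so the script runs
# anywhere; point the greps at the real tree to reproduce the actual lists.
set -eu
tmp=$(mktemp -d)
mkdir -p "$tmp/src/main" "$tmp/src/test"
printf 'import org.apache.commons.codec.binary.Base64;\n' > "$tmp/src/main/A.java"
printf 'import org.junit.Test;\nimport org.junit.Test;\n' > "$tmp/src/test/B.java"

# Compile-scope candidates: every distinct import under src/main.
grep -hr '^import' "$tmp/src/main" | sort | uniq
# Test-scope candidates: duplicate imports collapse to a single line each.
grep -hr '^import' "$tmp/src/test" | sort | uniq

rm -rf "$tmp"
```

Each distinct `import` line then maps onto one ivy/maven dependency candidate; the sort/uniq step matters because the same package is imported from many files.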
[jira] [Commented] (PIG-3194) Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
[ https://issues.apache.org/jira/browse/PIG-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13592896#comment-13592896 ]

Prashant Kommireddi commented on PIG-3194:
------------------------------------------

Would be great to have some ideas from others and discuss :)

Changes to ObjectSerializer.java break compatibility with Hadoop 0.20.2
-----------------------------------------------------------------------
                Key: PIG-3194
                URL: https://issues.apache.org/jira/browse/PIG-3194
            Project: Pig
         Issue Type: Bug
   Affects Versions: 0.11
           Reporter: Kai Londenberg

The changes to ObjectSerializer.java in the following commit, http://svn.apache.org/viewvc?view=revision&revision=1403934, break compatibility with Hadoop 0.20.2 clusters. The reason is that the code uses methods from Apache Commons Codec 1.4 which are not available in Apache Commons Codec 1.3, the version that ships with Hadoop 0.20.2. The offending methods are Base64.decodeBase64(String) and Base64.encodeBase64URLSafeString(byte[]). If I revert these changes, Pig 0.11.0 candidate 2 works well with our Hadoop 0.20.2 clusters.
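For context on the two offending calls: codec 1.3 only offers byte[]-based encodeBase64/decodeBase64, while 1.4 added the String-taking decode and the URL-safe, unpadded encoder. The sketch below (an illustration assuming coreutils base64 and POSIX tr, unrelated to any Pig patch) shows what the URL-safe variant produces: standard Base64 with '+' replaced by '-', '/' by '_', and the '=' padding dropped.

```shell
#!/bin/sh
# Illustration only (not the Pig code): codec 1.4's
# Base64.encodeBase64URLSafeString(byte[]) emits standard Base64 with
# '+' -> '-', '/' -> '_', and trailing '=' padding removed. Emulated here
# with coreutils base64 plus tr to make the transformation concrete.
set -eu
urlsafe() {
    printf '%s' "$1" | base64 | tr '+/' '-_' | tr -d '='
}
urlsafe 'pig?'   # standard form is "cGlnPw=="; URL-safe unpadded form is "cGlnPw"
```

A codec-1.3-only codebase has to build this behavior by hand from encodeBase64(byte[]), which is why reverting to the 1.3-compatible calls restores Hadoop 0.20.2 support.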
[jira] [Updated] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth J updated PIG-3214:
----------------------------
    Attachment: newlogo1.png
                newlogo2.png
                newlogo3.png
                newlogo4.png

New/improved mascot
-------------------
                Key: PIG-3214
                URL: https://issues.apache.org/jira/browse/PIG-3214
            Project: Pig
         Issue Type: Wish
         Components: site
   Affects Versions: 0.11
           Reporter: Andrew Musselman
           Priority: Minor
            Fix For: 0.12
        Attachments: newlogo1.png, newlogo2.png, newlogo3.png, newlogo4.png

Request to change the pig mascot to something more graphically appealing.
[jira] [Commented] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593065#comment-13593065 ]

Prashant Kommireddi commented on PIG-3214:
------------------------------------------

Thanks Prasanth, these look great. My vote would go to newlogo2!
[jira] [Commented] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593082#comment-13593082 ]

Cheolsoo Park commented on PIG-3214:
------------------------------------

My +1 to #2 as well. Thank you Prasanth for the hard work!
[jira] [Commented] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593122#comment-13593122 ]

Prasanth J commented on PIG-3214:
---------------------------------

Adding one more, similar to #2. (Attachments now include newlogo5.png.)
[jira] [Updated] (PIG-3214) New/improved mascot
[ https://issues.apache.org/jira/browse/PIG-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth J updated PIG-3214:
----------------------------
    Attachment: newlogo5.png
[jira] [Updated] (PIG-2507) Semicolon in parameters for UDF results in parsing error
[ https://issues.apache.org/jira/browse/PIG-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Chen updated PIG-2507:
------------------------------
    Affects Version/s: 0.11.1
                       0.11

Semicolon in parameters for UDF results in parsing error
--------------------------------------------------------
                Key: PIG-2507
                URL: https://issues.apache.org/jira/browse/PIG-2507
            Project: Pig
         Issue Type: Bug
   Affects Versions: 0.8.0, 0.9.1, 0.10.0, 0.11, 0.11.1
           Reporter: Vivek Padmanabhan
           Assignee: Timothy Chen
        Attachments: PIG_2507.patch

If I have a semicolon in the parameter passed to a UDF, the script execution will fail with a parsing error.

{code}
a = load 'i1' as (f1:chararray);
c = foreach a generate REGEX_EXTRACT(f1, '.;', 1);
dump c;
{code}

The above script fails with the error below:

{code}
[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: file test.pig, line 3, column 0 mismatched character 'EOF' expecting '''
{code}

Even replacing the semicolon with Unicode \u003B results in the same error:

{code}
c = foreach a generate REGEX_EXTRACT(f1, '.\u003B', 1);
{code}