Hi,

While calling a Python UDF from Pig (local mode), I am getting the following 
error in the attempt logs, and the error shown further below on the console. I 
am not able to trace what I am doing wrong, as the same code was working 
earlier. I am using RHEL 6.4 on a 64-bit machine with Hadoop 2.7.2, Pig 0.15 
and Python 3.5.

Traceback (most recent call last):
  File "/tmp/controller2772959444531928936.py", line 356, in <module>
    sys.argv[5], sys.argv[6], sys.argv[7], sys.argv[8])
  File "/tmp/controller2772959444531928936.py", line 88, in main
    input_str = self.get_next_input()
  File "/tmp/controller2772959444531928936.py", line 164, in get_next_input
    while input_str.endswith(END_RECORD_DELIM) == False:
TypeError: endswith first arg must be bytes or a tuple of bytes, not str
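
The TypeError above indicates that endswith is being called on a bytes object with a str argument. A minimal sketch reproducing that mismatch outside Pig follows. The delimiter value and the helper ends_with_delim are hypothetical stand-ins for illustration only; neither is taken from Pig's controller script.

```python
# Hypothetical stand-in for the delimiter in Pig's generated controller
# script; the real value is not shown in the traceback.
END_RECORD_DELIM = "|_\n"


def ends_with_delim(data, delim):
    """Return True if data ends with delim, encoding delim when data is bytes."""
    if isinstance(data, bytes) and isinstance(delim, str):
        delim = delim.encode("utf-8")
    return data.endswith(delim)


# Bytes, as produced by reading a binary stream under Python 3.
raw = b"some|record|_\n"

try:
    raw.endswith(END_RECORD_DELIM)  # bytes.endswith(str) is not allowed
    mismatch_raised = False
except TypeError:
    mismatch_raised = True

print(mismatch_raised)
print(ends_with_delim(raw, END_RECORD_DELIM))
```

Under Python 2 the same comparison would not raise, because str is a byte string there; under Python 3 the two types have to be made to agree before comparing.
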
The following is the error on the console:
java.lang.Exception: org.apache.pig.impl.streaming.StreamingUDFException: LINE :
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE :
    at org.apache.pig.impl.builtin.StreamingUDF$ProcessErrorThread.run(StreamingUDF.java:503)

Exception in thread "Thread-35" java.lang.NullPointerException
    at org.apache.pig.impl.builtin.StreamingUDF$ProcessOutputThread.run(StreamingUDF.java:468)


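Following is the Python code:

import os
from pig_util import outputSchema

# INTERNATIONALGTPATH (used below) must be defined before these functions
# are called; its value is not shown in this post.

@outputSchema('output_field_name:chararray')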
def readfileinlist(filename):
    with open(filename) as inputfile:
            lines = inputfile.read().splitlines()
    return lines

@outputSchema('output_field_name:boolean')
def intlgtinlist(srcgt,destgt,intgtllist):
    if (srcgt.startswith(tuple(intgtllist)) or
        destgt.startswith(tuple(intgtllist))):
            return True
    else:
            return False

@outputSchema('output_field_name:boolean')
def checkintlgtincdrs(aparty,srcgt,destgt):
    intgtllist = []
    try:
            if ((len(srcgt) > 0 or len(destgt) > 0) and (srcgt or destgt) and
                aparty.isdigit()):
                    if (os.path.isfile(INTERNATIONALGTPATH) and
                        os.access(INTERNATIONALGTPATH, os.R_OK) and
                        os.stat(INTERNATIONALGTPATH).st_size > 0):

                            #FUNCTION FOR READING THE FILE IN ARRAY/TUPLE
                            intgtllist = readfileinlist(INTERNATIONALGTPATH)

                            #CHECK FOR THE INPUT(ARG0) IN ARRAY/TUPLE
                            if intlgtinlist(srcgt,destgt,intgtllist):
                                    return True
                            else:
                                    return False
                    else:
                            return False
            else:
                    return False
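    # A tuple is required to catch both exception types; the expression
    # "OSError or IndexError" evaluates to OSError alone.
    except (OSError, IndexError):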
            pass

    return True
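
To check the UDF logic separately from Pig's streaming layer, the prefix test used in intlgtinlist can be run directly under Python 3.5. A small sketch with made-up prefix values follows; sample_prefixes and matches_any_prefix are hypothetical names used only for illustration, and the values do not come from the actual GT file.

```python
def matches_any_prefix(value, prefixes):
    """Return True if value starts with any of the given prefixes."""
    # str.startswith accepts a tuple of candidate prefixes.
    return value.startswith(tuple(prefixes))


# Hypothetical prefix list standing in for the lines read from the GT file.
sample_prefixes = ["9198", "4477"]

print(matches_any_prefix("919812345", sample_prefixes))
print(matches_any_prefix("123456789", sample_prefixes))
# An empty prefix list never matches.
print(matches_any_prefix("919812345", []))
```

If this behaves as expected on its own, the failure is more likely in how the streaming controller exchanges data with the Python process than in the UDF functions themselves.
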


Following is the Pig script:

 record = LOAD '/inreport/cdrs/ZTE_20160301*' USING PigStorage('|','-tagFile');

 REGISTER 'udf_smsiuc.py' using streaming_python as smsiucudfs;

 internationalcdrsfilter = FILTER record by smsiucudfs.checkintlgtincdrs($1,$26,$27);


Best regards

Amit Sharma

