Hi Meisam,

The status of the statement was “error”, but the state of the session was 
“idle” after the statement had finished failing.
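
A minimal sketch of reading those two values side by side. It assumes the
Livy REST API returns the session from GET /sessions/{sessionId} with a
"state" field, and the statement from
GET /sessions/{sessionId}/statements/{statementId} with a "state" field and an
"output" object carrying "status". The helper names and the base_url argument
are hypothetical and only for illustration.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON body (no auth or retries; sketch only)."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(session, statement):
    """Pick out the session state and statement status from two parsed responses."""
    # "output" can be missing or null while a statement is still running.
    output = statement.get("output") or {}
    return {
        "session_state": session.get("state"),
        "statement_state": statement.get("state"),
        "output_status": output.get("status"),
    }


def check(base_url, session_id, statement_id):
    """Fetch both resources from a Livy server (base_url is an assumption)."""
    session = fetch_json(f"{base_url}/sessions/{session_id}")
    statement = fetch_json(
        f"{base_url}/sessions/{session_id}/statements/{statement_id}"
    )
    return summarize(session, statement)
```

Under these assumptions, the situation described above would show up as a
session state of "idle" alongside an output status of "error".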

I can only share a fairly heavily obfuscated version of my code, but I’ll do my 
best to still make it useful:

# The job starts out by running Python on the driver. We query an external
# database to get keys to use for our Spark job.
                from data.Index import Index
                import datetime

                fname='/tmp/${filename}'
                indexjson = {
                                "loc": "${loc}",
                                "sample_size": None,
                                "query_predicates": ${query_predicates}
                }
                indx = Index("DATABASE", "${username}", "${password}")

# cntx is a Pandas DataFrame with a few thousand rows; size-wise it’s about
# 1 MB.
                cntx = indx.get_context(indexjson)

                from data.DataSpark import DataSpark
                adata = DataSpark()

# This is where the Spark job happens. This takes the keys from the external
# index and reads data off of HDFS directly (no Hive SQL or even Spark
# DataFrames). It then returns them as a concatenated Pandas DataFrame.
                result = adata.getData(sc=sc, indexdf=cntx)

# Writes to CSV in a tmp file, then uses pyarrow to write the tmp file to HDFS
                result.to_csv(fname, compression='gzip', chunksize=100000)
                print("Done Writing CSV TMP file - " + str(datetime.datetime.now()))

                import pyarrow as pa
                fs = pa.hdfs.connect()

                import os
                import io

                fs.upload('/user/hdfsprod/' + '${filename}', io.open(fname, mode='rb'))
                os.remove(fname)
                print("Done Writing CSV HDFS file - " + str(datetime.datetime.now()))

Thanks,
  Peter

From: Meisam Fathi <meisam.fa...@gmail.com>
Sent: Tuesday, April 23, 2019 11:44 PM
To: user@livy.incubator.apache.org
Subject: [EXT] Re: Code stops working after a few executions

What is the status of the job in Livy or in YARN?
Also, can you share your code, please?

Thanks,
Meisam

On Tue, Apr 23, 2019 at 9:08 PM Peter Wicks (pwicks) <pwi...@micron.com> wrote:
I am working in Livy v0.4 with Python, using sessions. If I run the same
Python code over and over, it works about four times, and then on the next
run I get this error:

Unexpected character ('#' (code 35)): expected a valid value (number, String, 
array, object, 'true', 'false' or 'null')
at [Source: #; line: 1, column: 2]

There aren’t any #’s in my code, and the code is identical between runs
anyway. I made it identical, following user reports of runs failing with this
error message, to try to reproduce the error.

Once this error starts appearing, the session is useless: running code on it
again does not succeed, though the status still shows as idle.

Any ideas on what might be going on here?

Thanks!
  Peter
