reductionista edited a comment on issue #355: Keras fit interface
URL: https://github.com/apache/madlib/pull/355#issuecomment-473492589

When I run dev-check, I get this error:

```
psql:/tmp/madlib.xNO5vR/convex/madlib_keras.sql_in.tmp:60: ERROR: plpy.Error: A plpy error occurred in the step function: ImportError: No module named keras (plpython.c:5038) (seg0 slice1 127.0.0.1:25432 pid=98371) (plpython.c:5038)
```

I've investigated this, and I have a theory about the cause.

My keras is installed under /usr/local/lib/python2.7/site-packages, and that directory is included in my PYTHONPATH. If I run madlib functions that involve only one level of calls to plpy, everything is fine. Even in the madlib_keras_fit() function itself, the "import keras" statement at the beginning of madlib_keras.py_in works the first time the module is imported. But after it successfully imports keras, it executes a SQL command to run fit_step(). Somewhere around the call to fit_step(), the environment gets modified so that /usr/local/lib/python2.7/site-packages is removed from the PYTHONPATH. Then, while madlib_keras.py is being loaded a second time to call fit_step(), the import fails.

This could be due to a general issue with Greenplum, namely that greenplum_path.sh always removes everything from the PYTHONPATH except for specific Python library directories under $GPHOME. This isn't usually a problem, since you can call greenplum_path.sh first in .bashrc and then add any other directories afterwards (which is what I do on my system). But possibly this is what causes things to fail when there is a nested call to a plpy function inside a SQL function inside a plpy function inside a SQL function. Not sure if there are other examples of this in our codebase?

It seems like we may or may not care about fixing this, but we should at least add a requirement somewhere that keras has to be installed in a system directory that Python knows about by default. It can't be in a custom directory somewhere, even if you set your PYTHONPATH to point to it.
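One way to check this theory would be to compare what the interpreter sees at each level of nesting. The following is only a minimal sketch, not part of MADlib: the helper name `pythonpath_report` and its return keys are hypothetical, and it assumes the same code can be run both from a shell and from inside a PL/Python function so the two reports can be compared. It lists the PYTHONPATH directories, flags any that are absent from `sys.path`, and tries the import.

```python
import os
import sys


def pythonpath_report(module_name, environ=None, search_path=None):
    """Summarize whether `module_name` imports and which PYTHONPATH
    directories are missing from the interpreter's search path.

    `environ` and `search_path` default to os.environ and sys.path;
    they are parameters only so the helper can be exercised with fake data.
    """
    environ = os.environ if environ is None else environ
    search_path = sys.path if search_path is None else search_path

    # Directories the environment asks Python to search.
    raw = environ.get("PYTHONPATH", "")
    env_dirs = [d for d in raw.split(os.pathsep) if d]

    # Directories from PYTHONPATH that the search path does not contain.
    normalized = set(os.path.normpath(p) for p in search_path if p)
    missing = [d for d in env_dirs if os.path.normpath(d) not in normalized]

    # Attempt the import itself. This uses the real sys.path and really
    # imports the module, just as the failing code path would.
    try:
        __import__(module_name)
        importable = True
    except ImportError:
        importable = False

    return {
        "importable": importable,
        "pythonpath_dirs": env_dirs,
        "missing_from_sys_path": missing,
    }


if __name__ == "__main__":
    print(pythonpath_report("keras"))
```

If the theory holds, the report produced inside the nested call should show an empty or shortened `pythonpath_dirs` compared with the report from the shell, together with `importable` set to False.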
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services