Hey, On 9 July 2014 18:38, Andi Vajda <va...@apache.org> wrote: > > On Wed, 9 Jul 2014, Andi Vajda wrote: > >> >> >> On Wed, 9 Jul 2014, Lee Skillen wrote: >> >>> Hey Andi, >>> >>> On 9 July 2014 13:50, Andi Vajda <va...@apache.org> wrote: >>>> >>>> >>>> >>>> Hi Lee, >>>> >>>> >>>> On Tue, 8 Jul 2014, Lee Skillen wrote: >>>> >>>>> Andi, thanks for the reply - I've created a github mirror of the >>>>> pylucene >>>>> project for our own use (which I intend to keep synced with your SVN >>>>> repository as its official upstream), located at: >>>>> >>>>> https://github.com/lskillen/pylucene >>>>> >>>>> As suggested I have formatted (and attached) a patch of the unfinished >>>>> code >>>>> that we're using for the through-layer exceptions. Alternatively the >>>>> diff >>>>> can be inspected via github by diff'ing between the new >>>>> feature-thru-exception branch that I have created and the master >>>>> branch, as >>>>> such: >>>>> >>>>> >>>>> https://github.com/lskillen/pylucene/compare/feature-thru-exception?expand=1 >>>>> >>>>> Although we've run the test suite without issues I realise there may >>>>> still >>>>> be functionality/style/logical issues with the code. I also suspect >>>>> that >>>>> there may not be specific test cases that target regression failures >>>>> for >>>>> exceptions (yet), so confidence isn't high! All comments are welcome >>>>> and I >>>>> realise this will likely require further changes and a repeatable test >>>>> case >>>>> before it is acceptable, but that's fine. >>>> >>>> >>>> >>>> I took a look at your patch and used the main idea to rewrite it so >>>> that: >>>> - ref handling is correct (hopefully): you had it backwards when >>>> calling >>>> Py_Restore(), it steals what you currently must own. >>>> - the Python error state is now saved as one tuple of (type, value, >>>> tb) on >>>> PythonException, not as three strings and three fake Objects >>>> (longs). >>>> - ref handling is correct via a finalize() method DECREF'ing the saved >>>> error state tuple on PythonException when that exception is >>>> collected. >>>> - getMessage() still works as before but the traceback is extracted on >>>> demand, not at throw time as was done before. >>>> >>>> The previous implementation was done so that no Python cross-VM >>>> reference had to be tracked, all the error data was string'ified at error >>>> time. >>>> Your change now requires that the error be kept 'alive' accross layers >>>> and refs must be tracked to avoid leaks. >>>> >>> >>> Great feedback, thank you - I suspected that the reference handling >>> wasn't quite there (first foray into CPython internals), and I also >>> really wanted to utilise a tuple and get rid of the strings but wasn't >>> sure what the standard was for extending a class such as >>> PythonException. I actually did a quick attempt at running everything >>> under debug mode to inspect for leaks, but suffice to say that my >>> usual toolbox for debugging and tracking memory leaks >>> (gdb/clang/valgrind) didn't work too well - Had some minor success >>> with the excellent objgraph Python package, but would need to spend >>> more time on it. >>> >>>> >>>> I did _not_ test this as I don't have a system currently running that >>>> uses JCC in reverse like this. Since you're actively using the feature, >>>> please test the attached patch (against trunk) and report back with >>>> comments, bugs, fixes, etc... >>>> >>> >>> Not a problem! I applied your modified patch against trunk, rebuilt >>> and re-ran our application. The good news is that everything is >>> working really well, and the only bug that I could see was within the >>> RegisterNatives() call within registerNatives() for PythonException >>> (size was still 2, changed it to calculate the size of the struct). >>> I'll re-attach the patch with the changes for you to review. >> >> >> Woops, I missed that - even though I looked for it. Oh well. Thanks. > > > I attached a new patch with your fix. I also removed the 'clear()' method > since it's no longer necessary: once the PythonException is constructed, the > error state in the Python VM is cleared because of PyErr_Fetch() being > called during saveErrorState().
Changes are looking great - Applied against trunk again here and all working well. I've also added a new (simple) test case within the test directory under the pylucene root (test_PythonException.py) that checks to see if the through-layer exceptions are working, realistically it should probably be checking the traceback as well but I think this would suffice to show that the exceptions are propagating properly. On a side note, I did have an issue building pylucene because java/org/apache/pylucene/store/PythonIndexOutput.java refused to build, as it was referencing a BufferedIndexOutput that seems to have been removed by LUCENE-5678 (I just removed it in my local trunk but didn't want to add that to the patch). Cheers, Lee > > Andi.. > > >> >>> The only other comments I have are that: >>> >>> (1) This was still an issue with my patch, but I don't know if there >>> is anyone out there relying upon the exact format of the string that >>> gets returned by getMessage(), as it is probably going to be different >>> now that it is calculated at a different point - In saying that I >>> imagine you would only be doing that if you were specifically trying >>> to handle an exception from the Python layer and were trying to >>> re-create it using the string, which wouldn't be an issue now. >> >> >> Yeah, I'm just concatenating the strings. I'm already using PyErr_Print(). >> >>> (2) I also noticed that your patch doesn't have the #if >>> defined(PYTHON) part to protect certain parts that rely on things like >>> getPythonExceptionClass() - Not sure if that was intentional or not >>> (guessing it might be one of those things that are always defined >>> anyway), but just wanted to highlight it. >> >> >> Not needed, all of functions.cpp depends on python amyway. >> >>> Apart from that, it looks great - In order to assist with replication >>> of the through-exception issue on your side I also whipped up a (very) >>> quick example and am also attaching this to the email. Utilises a >>> noddy Makefile to get the job done, but running "make" or "make test" >>> will generate the java/jcc packages and then run the test. Running >>> "make uninstall" will remove the jcc package, and "make clean" will >>> tidy up the build artifacts. Standard stuff! >>> >>> Hope that helps! >> >> >> Excellent, thanks ! >> >> Andi.. >> >>> >>> Thanks, >>> Lee >>> >>>> Thanks ! >>>> >>>> Andi.. >>>> >>>> >>>>> >>>>> Thanks, >>>>> Lee >>>>> >>>>> >>>>> >>>>> On 4 July 2014 18:17, Andi Vajda <va...@apache.org> wrote: >>>>> >>>>>> >>>>>> On Jul 4, 2014, at 18:33, Lee Skillen <lskil...@vulcanft.com> wrote: >>>>>> >>>>>> Hi Andi, >>>>>> >>>>>> First of all, absolutely fantastic work on the JCC project, we love it >>>>>> - >>>>>> Second of all, apologies for contacting you out of the blue via email, >>>>>> but >>>>>> I wasn't sure of the best place to contact you to ask a few questions >>>>>> (technical-level, not support-level). If it isn't acceptable to email >>>>>> please let me know and I can move the discussion somewhere else. >>>>>> >>>>>> >>>>>> Yes, please include pylucene-dev@lucene.apache.org (subscription >>>>>> required. send mail pylucene-dev-subscribe@ and follow the >>>>>> instructions >>>>>> in the reply). >>>>>> >>>>>> A bit of background first of all :- >>>>>> >>>>>> We've been integrating JCC for the past two weeks on a sizeable >>>>>> Python/Java/C++ project (where Python is a controller/glue layer) >>>>>> having >>>>>> migrated from a previous solution to integrate across the languages, >>>>>> and >>>>>> for the most part JCC has been a lifesaver. >>>>>> >>>>>> Technically we've only hit one major issue so far and that has been >>>>>> the >>>>>> translation of exceptions through the layers. Our Java application >>>>>> does a >>>>>> lot of double-dispatch (visitation) type of execution and we're >>>>>> calling >>>>>> outwards to Python to implement the concrete visitation logic. >>>>>> >>>>>> In the case of an exception being thrown it results in JCC collating a >>>>>> traceback and captured exception at every level of scope, resulting in >>>>>> a >>>>>> giant JavaError exception which has lost it's original PythonException >>>>>> context. >>>>>> >>>>>> I was actually able to fix this by attempting to capture the full >>>>>> context >>>>>> via PyErr_Fetch() when the first python object is thrown and storing >>>>>> it >>>>>> within PythonException, and then ensuring that this is carried through >>>>>> each >>>>>> layer appropriately - This seems to work very well, although >>>>>> admittedly I >>>>>> haven't considered regressive errors or compatibility yet, but just >>>>>> wanted >>>>>> to proof-of-concept it first of all then speak with a JCC maintainer. >>>>>> >>>>>> The net result is being able to catch errors in a pipeline similar to >>>>>> the >>>>>> following: >>>>>> >>>>>> Python >>>>>> -> Java >>>>>> -> Python >>>>>> -> Raise MyException >>>>>> <- Throw PythonException (containing MyException) >>>>>> <- Inject MyException >>>>>> Catch MyException >>>>>> >>>>>> My main questions are to ask your thoughts about :- >>>>>> >>>>>> - What are your thoughts about through-layer exceptions and a feature >>>>>> improvement like this? >>>>>> >>>>>> >>>>>> I agree that losing information during exception throwing is not good. >>>>>> If >>>>>> you've got an improvement in the form of a patch, do not hesitate in >>>>>> sending it in. >>>>>> >>>>>> - JCC is (I think) currently at home with pylucene, are there are >>>>>> plans to >>>>>> making it separate? >>>>>> >>>>>> >>>>>> No such plans at the moment. >>>>>> >>>>>> - Are you happy if I create a public github project and add the >>>>>> patched >>>>>> code in there for review? >>>>>> >>>>>> >>>>>> You're welcome to fork the code if you'd like but you don't have to if >>>>>> all >>>>>> you want is sending a patch. Your call. >>>>>> >>>>>> Thank you for the kind words ! >>>>>> >>>>>> Andi.. >>>>>> >>>>>> >>>>>> Hope to hear from you! >>>>>> >>>>>> Cheers, >>>>>> Lee >>>>>> >>>>>> >>>>>> >>>>>> TL;DR; >>>>>> >>>>>> We would like to :- >>>>>> >>>>>> - Improve JCC's through-layer exception support. >>>>>> - Establish a github project for JCC alone. >>>>>> >>>>>> -- >>>>>> Lee Skillen >>>>>> >>>>>> Vulcan Financial Technologies >>>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY >>>>>> >>>>>> Office: +44 (0)28 95 817888 >>>>>> Mobile: +44 (0)78 41 425152 >>>>>> Web: www.vulcanft.com >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Lee Skillen >>>>> >>>>> Vulcan Financial Technologies >>>>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY >>>>> >>>>> Office: +44 (0)28 95 817888 >>>>> Mobile: +44 (0)78 41 425152 >>>>> Web: www.vulcanft.com >>> >>> >>> >>> >>> >>> -- >>> Lee Skillen >>> >>> Vulcan Financial Technologies >>> 1st Floor, 47 Malone Road, Belfast, BT9 6RY >>> >>> Office: +44 (0)28 95 817888 >>> Web: www.vulcanft.com >>> > -- Lee Skillen Vulcan Financial Technologies 1st Floor, 47 Malone Road, Belfast, BT9 6RY Office: +44 (0)28 95 817888 Web: www.vulcanft.com
diff --git a/jcc/CHANGES b/jcc/CHANGES index fe89624..7d08a4f 100644 --- a/jcc/CHANGES +++ b/jcc/CHANGES @@ -1,3 +1,8 @@ +Version 2.20 -> +-------------------- + - improved through-layer Python error handling (with Lee Skillen) + - + Version 2.19 -> 2.20 -------------------- - added support for java varargs by reconstructing array from last parameters diff --git a/jcc/java/org/apache/jcc/PythonException.java b/jcc/java/org/apache/jcc/PythonException.java index 4eae86b..a41ce09 100644 --- a/jcc/java/org/apache/jcc/PythonException.java +++ b/jcc/java/org/apache/jcc/PythonException.java @@ -18,21 +18,29 @@ package org.apache.jcc; public class PythonException extends RuntimeException { public boolean withTrace = true; - protected String message, errorName, traceback; + protected long py_error_state = 0L; public PythonException(String message) { super(message); - getErrorInfo(); // sets errorName, message and traceback + saveErrorState(); + } + + public void finalize() + throws Throwable + { + pythonDecRef(); } public String getMessage(boolean trace) { - if (message == null) - message = super.getMessage(); + if (py_error_state == 0L) + return super.getMessage(); + + String message = getErrorMessage(); if (trace) - return message + "\n" + traceback; + return message + "\n" + getErrorTraceback(); return message; } @@ -42,16 +50,11 @@ public class PythonException extends RuntimeException { return getMessage(withTrace); } - public String getErrorName() - { - return errorName; - } + public native void pythonDecRef(); - public String getTraceback() - { - return traceback; - } + public native String getErrorName(); + public native String getErrorMessage(); + public native String getErrorTraceback(); - protected native void getErrorInfo(); - public native void clear(); + protected native void saveErrorState(); } diff --git a/jcc/jcc/sources/functions.cpp b/jcc/jcc/sources/functions.cpp index 7a85fdf..7dd442c 100644 --- a/jcc/jcc/sources/functions.cpp +++ b/jcc/jcc/sources/functions.cpp @@ -1355,10 +1355,42 @@ PyObject *PyErr_SetJavaError() { JNIEnv *vm_env = env->get_vm_env(); jthrowable throwable = vm_env->ExceptionOccurred(); - PyObject *err; vm_env->ExceptionClear(); - err = t_Throwable::wrap_Object(Throwable(throwable)); + + jclass pycls = env->getPythonExceptionClass(); + + // Support through-layer exceptions by taking the active PythonException + // and making the enclosed exception visible to Python again. + + if (vm_env->IsSameObject(vm_env->GetObjectClass(throwable), pycls)) + { + jfieldID fid = vm_env->GetFieldID(pycls, "py_error_state", "J"); + PyObject *state = (PyObject *) vm_env->GetLongField(throwable, fid); + + if (state != NULL) + { + PyObject *type = PyTuple_GET_ITEM(state, 0); + PyObject *value = PyTuple_GET_ITEM(state, 1); + PyObject *tb = PyTuple_GET_ITEM(state, 2); + + Py_INCREF(type); + if (value == Py_None) + value = NULL; + else + Py_INCREF(value); + if (tb == Py_None) + tb = NULL; + else + Py_INCREF(tb); + + PyErr_Restore(type, value, tb); + + return NULL; + } + } + + PyObject *err = t_Throwable::wrap_Object(Throwable(throwable)); PyErr_SetObject(PyExc_JavaError, err); Py_DECREF(err); diff --git a/jcc/jcc/sources/jcc.cpp b/jcc/jcc/sources/jcc.cpp index 0062bb5..defb32f 100644 --- a/jcc/jcc/sources/jcc.cpp +++ b/jcc/jcc/sources/jcc.cpp @@ -750,68 +750,160 @@ extern "C" { } }; -static void JNICALL _PythonException_getErrorInfo(JNIEnv *vm_env, jobject self) +static void JNICALL _PythonException_pythonDecRef(JNIEnv *vm_env, jobject self) +{ + jclass jcls = vm_env->GetObjectClass(self); + jfieldID fid = vm_env->GetFieldID(jcls, "py_error_state", "J"); + PyObject *state = (PyObject *) vm_env->GetLongField(self, fid); + + if (state != NULL) + { + PythonGIL gil(vm_env); + + Py_DECREF(state); + vm_env->SetLongField(self, fid, (jlong) 0); + } +} + +static void JNICALL _PythonException_saveErrorState(JNIEnv *vm_env, + jobject self) { PythonGIL gil(vm_env); + PyObject *type, *value, *tb; + + PyErr_Fetch(&type, &value, &tb); + + if (type != NULL) + { + PyObject *state = PyTuple_New(3); + + PyErr_NormalizeException(&type, &value, &tb); + PyTuple_SET_ITEM(state, 0, type); + if (value == NULL) + { + PyTuple_SET_ITEM(state, 1, Py_None); + Py_INCREF(Py_None); + } + else + PyTuple_SET_ITEM(state, 1, value); + if (tb == NULL) + { + PyTuple_SET_ITEM(state, 2, Py_None); + Py_INCREF(Py_None); + } + else + PyTuple_SET_ITEM(state, 2, tb); + + jclass jcls = vm_env->GetObjectClass(self); + jfieldID fid = vm_env->GetFieldID(jcls, "py_error_state", "J"); - if (!PyErr_Occurred()) - return; + vm_env->SetLongField(self, fid, (jlong) state); + } +} - PyObject *type, *value, *tb, *errorName; +static jstring JNICALL _PythonException_getErrorName(JNIEnv *vm_env, + jobject self) +{ jclass jcls = vm_env->GetObjectClass(self); + jfieldID fid = vm_env->GetFieldID(jcls, "py_error_state", "J"); + PyObject *state = (PyObject *) vm_env->GetLongField(self, fid); - PyErr_Fetch(&type, &value, &tb); + if (state == NULL) + return NULL; + + PythonGIL gil(vm_env); + PyObject *errorName = + PyObject_GetAttrString(PyTuple_GET_ITEM(state, 0), "__name__"); - errorName = PyObject_GetAttrString(type, "__name__"); if (errorName != NULL) { - jfieldID fid = - vm_env->GetFieldID(jcls, "errorName", "Ljava/lang/String;"); jstring str = env->fromPyString(errorName); - - vm_env->SetObjectField(self, fid, str); - vm_env->DeleteLocalRef(str); Py_DECREF(errorName); + + return str; } - if (value != NULL) + return NULL; +} + +static jstring JNICALL _PythonException_getErrorMessage(JNIEnv *vm_env, + jobject self) +{ + jclass jcls = vm_env->GetObjectClass(self); + jfieldID fid = vm_env->GetFieldID(jcls, "py_error_state", "J"); + PyObject *state = (PyObject *) vm_env->GetLongField(self, fid); + + if (state == NULL) + return NULL; + + PythonGIL gil(vm_env); + PyObject *value = PyTuple_GET_ITEM(state, 1); + + if (value != Py_None) { PyObject *message = PyObject_Str(value); if (message != NULL) { - jfieldID fid = - vm_env->GetFieldID(jcls, "message", "Ljava/lang/String;"); jstring str = env->fromPyString(message); - - vm_env->SetObjectField(self, fid, str); - vm_env->DeleteLocalRef(str); Py_DECREF(message); + + return str; } } + return NULL; +} + +static jstring JNICALL _PythonException_getErrorTraceback(JNIEnv *vm_env, + jobject self) +{ + jclass jcls = vm_env->GetObjectClass(self); + jfieldID fid = vm_env->GetFieldID(jcls, "py_error_state", "J"); + PyObject *state = (PyObject *) vm_env->GetLongField(self, fid); + + if (state == NULL) + return NULL; + + PythonGIL gil(vm_env); PyObject *module = NULL, *cls = NULL, *stringIO = NULL, *result = NULL; PyObject *_stderr = PySys_GetObject("stderr"); + if (!_stderr) - goto err; + return NULL; module = PyImport_ImportModule("cStringIO"); if (!module) - goto err; + return NULL; cls = PyObject_GetAttrString(module, "StringIO"); Py_DECREF(module); if (!cls) - goto err; + return NULL; stringIO = PyObject_CallObject(cls, NULL); Py_DECREF(cls); if (!stringIO) - goto err; + return NULL; Py_INCREF(_stderr); PySys_SetObject("stderr", stringIO); + PyObject *type = PyTuple_GET_ITEM(state, 0); + PyObject *value = PyTuple_GET_ITEM(state, 1); + PyObject *tb = PyTuple_GET_ITEM(state, 2); + jstring str = NULL; + + Py_INCREF(type); + if (value == Py_None) + value = NULL; + else + Py_INCREF(value); + if (tb == Py_None) + tb = NULL; + else + Py_INCREF(tb); + PyErr_Restore(type, value, tb); PyErr_Print(); @@ -820,39 +912,31 @@ static void JNICALL _PythonException_getErrorInfo(JNIEnv *vm_env, jobject self) if (result != NULL) { - jfieldID fid = - vm_env->GetFieldID(jcls, "traceback", "Ljava/lang/String;"); - jstring str = env->fromPyString(result); - - vm_env->SetObjectField(self, fid, str); - vm_env->DeleteLocalRef(str); + str = env->fromPyString(result); Py_DECREF(result); } PySys_SetObject("stderr", _stderr); Py_DECREF(_stderr); - return; - - err: - PyErr_Restore(type, value, tb); -} - -static void JNICALL _PythonException_clear(JNIEnv *vm_env, jobject self) -{ - PythonGIL gil(vm_env); - PyErr_Clear(); + return str; } static void registerNatives(JNIEnv *vm_env) { jclass cls = vm_env->FindClass("org/apache/jcc/PythonException"); JNINativeMethod methods[] = { - { "getErrorInfo", "()V", (void *) _PythonException_getErrorInfo }, - { "clear", "()V", (void *) _PythonException_clear }, + { "pythonDecRef", "()V", (void *) _PythonException_pythonDecRef }, + { "saveErrorState", "()V", (void *) _PythonException_saveErrorState }, + { "getErrorName", "()Ljava/lang/String;", + (void *) _PythonException_getErrorName }, + { "getErrorMessage", "()Ljava/lang/String;", + (void *) _PythonException_getErrorMessage }, + { "getErrorTraceback", "()Ljava/lang/String;", + (void *) _PythonException_getErrorTraceback }, }; - vm_env->RegisterNatives(cls, methods, 2); + vm_env->RegisterNatives(cls, methods, sizeof(methods) / sizeof(methods[0])); } #endif /* _jcc_lib */ diff --git a/test/test_PythonException.py b/test/test_PythonException.py new file mode 100644 index 0000000..343df62 --- /dev/null +++ b/test/test_PythonException.py @@ -0,0 +1,50 @@ +# ==================================================================== +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ==================================================================== + +import sys, lucene, unittest +from PyLuceneTestCase import PyLuceneTestCase + +from org.apache.lucene.analysis.standard import StandardAnalyzer +from org.apache.lucene.util import Version +from org.apache.pylucene.queryparser.classic import \ + PythonQueryParser, PythonMultiFieldQueryParser + + +class PythonExceptionTestCase(PyLuceneTestCase): + def testThroughLayerException(self): + class TestException(Exception): + pass + + class TestQueryParser(PythonQueryParser): + def getFieldQuery_quoted(_self, field, queryText, quoted): + raise TestException("TestException") + + qp = TestQueryParser(Version.LUCENE_CURRENT, 'all', + StandardAnalyzer(Version.LUCENE_CURRENT)) + + with self.assertRaises(TestException): + qp.parse("foo bar") + + +if __name__ == "__main__": + lucene.initVM(vmargs=['-Djava.awt.headless=true']) + if '-loop' in sys.argv: + sys.argv.remove('-loop') + while True: + try: + unittest.main() + except: + pass + else: + unittest.main()