The script must have thought about it somehow :-) Have a great, undisturbed vacation!
roman

On Thu, Jul 19, 2012 at 9:33 AM, Andi Vajda <va...@apache.org> wrote:
>
> On Fri, 13 Jul 2012, Roman Chyla wrote:
>
>> Hi,
>> I was playing with the idea of creating virtual packages, attached is a
>> working script that illustrates it. I am getting this output:
>>
>> Did it work?
>
>
> No, I haven't forgotten, I'm just on vacation.
>
> Andi..
>
>
>> ==================
>> from org.apache.lucene.search import SearcherFactory; print SearcherFactory
>> <type 'SearcherFactory'>
>> from org.apache.lucene.analysis import Analyzer as Banalyzer; print Banalyzer
>> <type 'Analyzer'>
>> print sys.modules['org']
>> <module 'org' (built-in)>
>> print sys.modules['org.apache']
>> <module 'org.apache' (built-in)>
>> print sys.modules['org.apache.lucene']
>> <module 'org.apache.lucene' (built-in)>
>> print sys.modules['org.apache.lucene.search']
>> <module 'org.apache.lucene.search' (built-in)>
>>
>> Cheers,
>>
>> roman
>>
>>
>> On Fri, Jul 13, 2012 at 1:34 PM, Andi Vajda <va...@apache.org> wrote:
>>
>>>
>>> On Jul 13, 2012, at 18:33, Roman Chyla <roman.ch...@gmail.com> wrote:
>>>
>>>> I think this would be great. Let me add little bit more to your
>>>> observations (whole night yesterday was spent fighting with renames -
>>>> because I was building a project which imports shared lucene and solr --
>>>> there were thousands of same classes, I am not sure it would be possible
>>>> without some sort of a flexible rename...)
>>>>
>>>> JCC is a great tool and is used by potentially many projects - so
>>>> stripping "org.apache" seems right for pylucene, but looks arbitrary
>>>> otherwise
>>>
>>>
>>> Yes, I forgot to say that there would be a way to declare one or more
>>> mappings so that org.apache.lucene becomes lucene.
>>>
>>> Andi..
>>>
>>>> (unless there is a flexible stripping mechanism). Also, if the full
>>>> namespace remains original, then the code written in Python would be
>>>> also executable by Jython, which is IMHO an advantage.
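To make the "declare one or more mappings" idea concrete, here is a minimal sketch of how a prefix mapping such as org.apache.lucene -> lucene could be applied to a Java package name. The function name `map_package` and the dict format are hypothetical, made up for illustration; they are not existing JCC options.

```python
def map_package(java_package, mappings):
    """Rewrite the longest matching Java package prefix, if any.

    `mappings` is a dict such as {'org.apache.lucene': 'lucene'}. Both this
    function and the dict format are hypothetical, not part of JCC.
    """
    # Try longer prefixes first so that the most specific mapping wins.
    for prefix in sorted(mappings, key=len, reverse=True):
        if java_package == prefix:
            return mappings[prefix]
        # Match whole package components only ('org.apache.lucenex' must
        # not match the prefix 'org.apache.lucene').
        if java_package.startswith(prefix + '.'):
            return mappings[prefix] + java_package[len(prefix):]
    # No mapping declared: keep the original (Jython-compatible) name.
    return java_package
```

With no mapping declared the full Java name is kept, which would preserve the Jython compatibility mentioned above; with {'org.apache.lucene': 'lucene'} the package org.apache.lucene.search would be exposed as lucene.search.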
>>>>
>>>> But this being Python, the packages cannot be spread in different
>>>> locations (ie. there can be only one org.apache.lucene.analysis package) -
>>>> unless there exists (again) some flexible mechanism which populates the
>>>> namespace with objects that belong there. It may seem an overkill to you,
>>>> because for single projects it would work, but seems perfectly justifiable
>>>> in case of imported shared libraries
>>>>
>>>> I don't know what is your idea for implementing the python packages, but
>>>> your last email got me thinking as well - there might be a very simple way
>>>> of getting to the java packages inside Python without too much work.
>>>>
>>>> Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
>>>> python as org_apache_lucene_search_IndexSearcher
>>>>
>>>> and users do:
>>>>
>>>> import lucene
>>>> lucene.initVM()
>>>>
>>>> initVM() first initiates java VM (and populates the lucene namespace with
>>>> all objects), but then it will call jcc.register_module(self)
>>>>
>>>> A new piece of code inside JCC grabs the lucene module and creates (on the
>>>> fly) python packages -- using types.ModuleType (or new.module()) -- the new
>>>> packages will be inserted into sys.modules
>>>>
>>>> so after lucene.initVM() returns
>>>>
>>>> users can do "from org.apache.lucene.search import IndexSearcher" and get
>>>> lucene.org_apache_lucene_search_IndexSearcher object
>>>>
>>>> and also, when shared libraries are present (let's say 'solr') users do:
>>>>
>>>> import solr
>>>> solr.initVM()
>>>>
>>>> The JCC will just update the existing packages and create new ones if
>>>> needed (and from this perspective, having fully qualified name is safer
>>>> than to have lucene.search.IndexSearcher)
>>>>
>>>> I think this change is totally possible and will not change the way how
>>>> extensions are built. Does it have some serious flaw?
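For readers without the attached script, the mechanism described above can be sketched roughly as follows. This is not the attachment and not JCC code: `register_module` only borrows the name proposed in the mail, the explicit `java_names` mapping is an assumption of this sketch (splitting flat names on underscores would be ambiguous), and a plain module object stands in for the real lucene extension, so no JVM is involved.

```python
import sys
import types


def _ensure_package(dotted_name):
    """Create (or reuse) every module along dotted_name in sys.modules."""
    parent = None
    parts = dotted_name.split('.')
    for i in range(len(parts)):
        name = '.'.join(parts[:i + 1])
        module = sys.modules.get(name)
        if module is None:
            module = types.ModuleType(name)
            module.__path__ = []  # marks the module as a package
            sys.modules[name] = module
        if parent is not None:
            # Make 'org.apache' reachable as an attribute of 'org', etc.
            setattr(parent, parts[i], module)
        parent = module
    return parent


def register_module(flat_module, java_names):
    """Expose classes of a flat module under virtual Java-style packages.

    java_names maps an attribute name of flat_module to the fully qualified
    Java class name (hypothetical input, assumed to be known to the caller).
    """
    for attr, qualified in java_names.items():
        package_name, _, class_name = qualified.rpartition('.')
        package = _ensure_package(package_name)
        setattr(package, class_name, getattr(flat_module, attr))


# Stand-in for the flat 'lucene' extension module.
class IndexSearcher(object):
    pass


flat = types.ModuleType('lucene')
flat.org_apache_lucene_search_IndexSearcher = IndexSearcher
register_module(flat, {
    'org_apache_lucene_search_IndexSearcher':
        'org.apache.lucene.search.IndexSearcher',
})

# The virtual package is now importable like a regular one.
from org.apache.lucene.search import IndexSearcher as Imported
```

Because existing entries in sys.modules are reused, a second call for another extension (the 'solr' case above) would only add to the packages already created.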
>>>>
>>>> I would be of course more than happy to contribute and test.
>>>>
>>>> Best,
>>>>
>>>> roman
>>>>
>>>>
>>>> On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda <va...@apache.org> wrote:
>>>>
>>>>>
>>>>> On Tue, 10 Jul 2012, Andi Vajda wrote:
>>>>>
>>>>>>> I would also like to propose a change, to allow for more flexible
>>>>>>> mechanism of generating Python class names. The patch doesn't change
>>>>>>> the default pylucene behaviour, but it gives people a way to replace
>>>>>>> class names with patterns. I have noticed that there are more
>>>>>>> same-name classes from different packages in the new lucene (and it
>>>>>>> becomes worse when one has to deal with both lucene and solr).
>>>>>>>
>>>>>>
>>>>>> Another way to fix this is to reproduce the namespace hierarchy used in
>>>>>> Lucene, following along the Java packages, something I've been dreading
>>>>>> to do. Lucene just loooooves a really long deeply nested class structure.
>>>>>> I'm not convinced yet it is bad enough to go down that route, though.
>>>>>>
>>>>>> Your proposal to use patterns may in fact yield a much more convenient
>>>>>> solution. Thanks !
>>>>>>
>>>>>
>>>>> Rethinking this a bit, I'm prepared to change my mind on this. Your
>>>>> patterned rename patch shows that we're slowly but surely reaching the
>>>>> limit of the current setup that consists in throwing all wrapped classes
>>>>> under the one global 'lucene' namespace.
>>>>>
>>>>> Lucene 4.0 has seen a large number of deeply nested classes with similar
>>>>> names added since 3.x. Renaming these one by one (or excluding some)
>>>>> doesn't scale. Using the proposed patterned rename scales more but makes
>>>>> it difficult to know what got renamed and how.
>>>>> Ultimately, the more classes that are like-named, the more classes would
>>>>> have unstable names from one release to the next as more duplicated names
>>>>> are encountered.
>>>>>
>>>>> What if instead JCC supported the original Java namespaces all the way to
>>>>> the Python interface (still dropping the original 'org.apache' Java
>>>>> package tree prefix) ?
>>>>> The world-rooted style of naming Java classes isn't Pythonic but using the
>>>>> second half of the package structure feels right at home in the Python
>>>>> world.
>>>>>
>>>>> JCC already re-creates the complete Java package structure in C++ as
>>>>> namespaces for all the C++ code it generates, for both the JNI wrapper
>>>>> classes and the C++/Python types. It's only the installation of the class
>>>>> names into the Python VM that is done in the flat 'lucene' namespace.
>>>>>
>>>>> I think it shouldn't be too hard to change the code that installs classes
>>>>> to create sub-modules of the lucene module and install classes in these
>>>>> submodules instead (down to however many levels are in the original).
>>>>>
>>>>> In other words:
>>>>>  - from lucene import Document
>>>>> would become
>>>>>  - from lucene.document import Document
>>>>>
>>>>> One could of course also say:
>>>>>  - from lucene.document import Document as whateverOneLikes
>>>>>
>>>>> If that proposal isn't mortally flawed somewhere, I'm prepared to drop
>>>>> support for --rename and replace it with this new Python class/module
>>>>> layout.
>>>>>
>>>>> Since this is being talked about in the context of a major PyLucene
>>>>> release, version 4.0, and that all tests/samples have to be reworked
>>>>> anyway, this backwards compat break shouldn't be too controversial,
>>>>> hopefully.
>>>>>
>>>>> If it is, the old --rename could be preserved for sure, but I'd prefer
>>>>> simplifying the JCC interface than to accrete more to it.
>>>>>
>>>>> What do you think ?
>>>>>
>>>>> Andi..
>>>>>
>>>>>
>>>>>> Andi..
>>>>>>
>>>>>>
>>>>>>> I can confirm the test_test_BinaryDocument.py crashes the JVM no
>>>>>>> more.
>>>>>>>
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 10, 2012 at 8:54 AM, Andi Vajda <va...@apache.org> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 9 Jul 2012, Roman Chyla wrote:
>>>>>>>>
>>>>>>>>> Thanks, I am attaching a new patch that adds the missing test base.
>>>>>>>>>
>>>>>>>>> Sorry for the tabs, I was probably messing around with a few editors
>>>>>>>>> (some of them not configured properly)
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I integrated your test class (renaming it to fit the naming scheme
>>>>>>>> used).
>>>>>>>> Thanks !
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>> So far, found one serious problem, crashes VM -- see e.g.
>>>>>>>>>>>>> test/test_BinaryDocument.py - when getting the document using:
>>>>>>>>>>>>> reader.document(0)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>
>>>>>>>> test/test_BinaryDocument.py doesn't seem to crash the VM but fails
>>>>>>>> because of some API changes. I suspect the crash to be some issue
>>>>>>>> related to using an older jcc.
>>>>>>>>
>>>>>>>> I see a comment saying: "couldn't find any combination with lucene4.0
>>>>>>>> where it would raise errors". Most of these unit tests are straight
>>>>>>>> ports from the original Java version. If you're stumped about a change,
>>>>>>>> check the original Java test, it may have changed too.
>>>>>>>>
>>>>>>>> Andi..
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>
>>
>