The script must have thought about it somehow :-) Have a great,
undisturbed vacation!

roman

On Thu, Jul 19, 2012 at 9:33 AM, Andi Vajda <va...@apache.org> wrote:
>
> On Fri, 13 Jul 2012, Roman Chyla wrote:
>
>> Hi,
>> I was playing with the idea of creating virtual packages, attached is a
>> working script that illustrates it. I am getting this output:
>>
>> Did it work?
>
>
> No, I haven't forgotten, I'm just on vacation.
>
> Andi..
>
>
>> ==================
>> from org.apache.lucene.search import SearcherFactory; print SearcherFactory
>> <type 'SearcherFactory'>
>> from org.apache.lucene.analysis import Analyzer as Banalyzer; print Banalyzer
>> <type 'Analyzer'>
>> print sys.modules['org']
>> <module 'org' (built-in)>
>> print sys.modules['org.apache']
>> <module 'org.apache' (built-in)>
>> print sys.modules['org.apache.lucene']
>> <module 'org.apache.lucene' (built-in)>
>> print sys.modules['org.apache.lucene.search']
>> <module 'org.apache.lucene.search' (built-in)>
>>
>> Cheers,
>>
>>  roman
>>
>>
>> On Fri, Jul 13, 2012 at 1:34 PM, Andi Vajda <va...@apache.org> wrote:
>>
>>>
>>> On Jul 13, 2012, at 18:33, Roman Chyla <roman.ch...@gmail.com> wrote:
>>>
>>>> I think this would be great. Let me add a little bit more to your
>>>> observations (the whole night yesterday was spent fighting with renames,
>>>> because I was building a project which imports shared lucene and solr;
>>>> there were thousands of same-named classes, and I am not sure it would be
>>>> possible without some sort of flexible rename...)
>>>>
>>>> JCC is a great tool and is used by potentially many projects - so stripping
>>>> "org.apache" seems right for pylucene, but looks arbitrary otherwise
>>>
>>>
>>> Yes, I forgot to say that there would be a way to declare one or more
>>> mappings  so that org.apache.lucene becomes lucene.
>>>
>>> Andi..
>>>
>>>> (unless there is a flexible stripping mechanism). Also, if the full
>>>> namespace remains original, then the code written in Python would also be
>>>> executable by Jython, which is IMHO an advantage.
>>>>
>>>> But this being Python, the packages cannot be spread across different
>>>> locations (i.e. there can be only one org.apache.lucene.analysis package) -
>>>> unless there exists (again) some flexible mechanism which populates the
>>>> namespace with objects that belong there. It may seem like overkill to you,
>>>> because for single projects it would work, but it seems perfectly
>>>> justifiable in the case of imported shared libraries.
>>>>
>>>> I don't know what your idea is for implementing the Python packages, but
>>>> your last email got me thinking as well - there might be a very simple way
>>>> of getting to the Java packages inside Python without too much work.
>>>>
>>>> Let's say the java "org.apache.lucene.search.IndexSearcher" is known to
>>>> python as org_apache_lucene_search_IndexSearcher
>>>>
>>>> and users do:
>>>>
>>>> import lucene
>>>> lucene.initVM()
>>>>
>>>> initVM() first starts the Java VM (and populates the lucene namespace with
>>>> all objects), and then calls jcc.register_module(self)
>>>>
>>>> A new piece of code inside JCC grabs the lucene module and creates (on the
>>>> fly) Python packages -- using types.ModuleType (or new.module()) -- and the
>>>> new packages are inserted into sys.modules
>>>>
>>>> so after lucene.initVM() returns
>>>>
>>>> users can do "from org.apache.lucene.search import IndexSearcher" and get
>>>> the lucene.org_apache_lucene_search_IndexSearcher object
>>>>
>>>> and also, when shared libraries are present (let's say 'solr') users do:
>>>>
>>>> import solr
>>>> solr.initVM()
>>>>
>>>> JCC will just update the existing packages and create new ones if
>>>> needed (and from this perspective, having the fully qualified name is safer
>>>> than having lucene.search.IndexSearcher)
>>>>
>>>> I think this change is entirely possible and will not change the way
>>>> extensions are built. Does it have some serious flaw?
>>>>
>>>> I would be of course more than happy to contribute and test.
>>>>
>>>> Best,
>>>>
>>>>  roman
>>>>
>>>>
>>>> On Fri, Jul 13, 2012 at 11:47 AM, Andi Vajda <va...@apache.org> wrote:
>>>>
>>>>>
>>>>> On Tue, 10 Jul 2012, Andi Vajda wrote:
>>>>>
>>>>>>> I would also like to propose a change, to allow for a more flexible
>>>>>>> mechanism of generating Python class names. The patch doesn't change
>>>>>>> the default pylucene behaviour, but it gives people a way to replace
>>>>>>> class names with patterns. I have noticed that there are more
>>>>>>> same-name classes from different packages in the new lucene (and it
>>>>>>> becomes worse when one has to deal with both lucene and solr).
>>>>>>>
>>>>>>
>>>>>> Another way to fix this is to reproduce the namespace hierarchy used in
>>>>>> Lucene, following along the Java packages, something I've been dreading to
>>>>>> do. Lucene just loooooves a really long deeply nested class structure.
>>>>>> I'm not convinced yet it is bad enough to go down that route, though.
>>>>>>
>>>>>> Your proposal to use patterns may in fact yield a much more convenient
>>>>>> solution. Thanks !
>>>>>>
>>>>>
>>>>> Rethinking this a bit, I'm prepared to change my mind on this. Your
>>>>> patterned rename patch shows that we're slowly but surely reaching the
>>>>> limit of the current setup, which consists of throwing all wrapped classes
>>>>> under the one global 'lucene' namespace.
>>>>>
>>>>> Lucene 4.0 has seen a large number of deeply nested classes with similar
>>>>> names added since 3.x. Renaming these one by one (or excluding some)
>>>>> doesn't scale. Using the proposed patterned rename scales better but makes
>>>>> it difficult to know what got renamed and how.
>>>>> Ultimately, the more classes that are like-named, the more classes would
>>>>> have unstable names from one release to the next as more duplicated names
>>>>> are encountered.
>>>>>
>>>>> What if instead JCC supported the original Java namespaces all the way to
>>>>> the Python interface (still dropping the original 'org.apache' Java package
>>>>> tree prefix) ?
>>>>> The world-rooted style of naming Java classes isn't Pythonic but using the
>>>>> second half of the package structure feels right at home in the Python
>>>>> world.
>>>>>
>>>>> JCC already re-creates the complete Java package structure in C++ as
>>>>> namespaces for all the C++ code it generates, for both the JNI wrapper
>>>>> classes and the C++/Python types. It's only the installation of the class
>>>>> names into the Python VM that is done in the flat 'lucene' namespace.
>>>>>
>>>>> I think it shouldn't be too hard to change the code that installs classes
>>>>> to create sub-modules of the lucene module and install classes in these
>>>>> submodules instead (down to however many levels are in the original).
>>>>>
>>>>> In other words:
>>>>>  - from lucene import Document
>>>>> would become
>>>>>  - from lucene.document import Document
>>>>>
>>>>> One could of course also say:
>>>>>  - from lucene.document import Document as whateverOneLikes
>>>>>
>>>>> If that proposal isn't mortally flawed somewhere, I'm prepared to drop
>>>>> support for --rename and replace it with this new Python class/module
>>>>> layout.
>>>>>
>>>>> Since this is being talked about in the context of a major PyLucene
>>>>> release, version 4.0, and that all tests/samples have to be reworked
>>>>> anyway, this backwards compat break shouldn't be too controversial,
>>>>> hopefully.
>>>>>
>>>>> If it is, the old --rename could be preserved for sure, but I'd prefer
>>>>> simplifying the JCC interface to accreting more onto it.
>>>>>
>>>>> What do you think ?
>>>>>
>>>>> Andi..
>>>>>
>>>>>
>>>>>> Andi..
>>>>>>
>>>>>>
>>>>>>> I can confirm that test_BinaryDocument.py no longer crashes the JVM.
>>>>>>>
>>>>>>> Roman
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 10, 2012 at 8:54 AM, Andi Vajda <va...@apache.org> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 9 Jul 2012, Roman Chyla wrote:
>>>>>>>>
>>>>>>>>> Thanks, I am attaching a new patch that adds the missing test base.
>>>>>>>>>
>>>>>>>>> Sorry for the tabs, I was probably messing around with a few
>>>>>>>>> editors
>>>>>>>>> (some of them not configured properly)
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I integrated your test class (renaming it to fit the naming scheme
>>>>>>>> used).
>>>>>>>> Thanks !
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>> So far, found one serious problem, crashes VM -- see e.g.
>>>>>>>>>>>>> test/test_BinaryDocument.py - when getting the document using:
>>>>>>>>>>>>> reader.document(0)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>
>>>>>>>> test/test_BinaryDocument.py doesn't seem to crash the VM but fails because
>>>>>>>> of some API changes. I suspect the crash to be some issue related to using
>>>>>>>> an older jcc.
>>>>>>>>
>>>>>>>> I see a comment saying: "couldn't find any combination with lucene4.0
>>>>>>>> where it would raise errors". Most of these unit tests are straight ports
>>>>>>>> from the original Java version. If you're stumped about a change, check the
>>>>>>>> original Java test, it may have changed too.
>>>>>>>>
>>>>>>>> Andi..
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>
>>
>
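
The "virtual packages" idea discussed in this thread (build package objects with types.ModuleType and insert them into sys.modules) can be sketched roughly as follows. This is only an illustration, not the script attached to the thread and not JCC's actual code: register_module is the hypothetical name proposed above, the `lucene` and `solr` objects are stand-in modules rather than real JCC extensions, and splitting flat names on "_" assumes that neither package nor class names contain underscores.

```python
import sys
import types


def register_module(flat_module, separator="_"):
    """Expose flat names such as org_apache_lucene_search_IndexSearcher
    as importable dotted packages (hypothetical helper, not a JCC API)."""
    for flat_name, obj in list(vars(flat_module).items()):
        parts = flat_name.split(separator)
        if flat_name.startswith(separator) or len(parts) < 2:
            continue  # skip private names and names without a package part
        package_parts, class_name = parts[:-1], parts[-1]
        parent = None
        for depth in range(1, len(package_parts) + 1):
            dotted = ".".join(package_parts[:depth])
            module = sys.modules.get(dotted)
            if module is None:
                # Create the package on the fly and make it importable.
                module = types.ModuleType(dotted)
                module.__path__ = []  # mark it as a package
                sys.modules[dotted] = module
            if parent is not None:
                # Make "import org.apache" style attribute access work too.
                setattr(parent, package_parts[depth - 1], module)
            parent = module
        setattr(parent, class_name, obj)


# Stand-ins for two extensions that share the org.apache prefix.
lucene = types.ModuleType("lucene")
lucene.org_apache_lucene_search_IndexSearcher = type("IndexSearcher", (), {})
solr = types.ModuleType("solr")
solr.org_apache_solr_core_SolrCore = type("SolrCore", (), {})

register_module(lucene)
register_module(solr)  # reuses the existing org and org.apache packages

from org.apache.lucene.search import IndexSearcher
from org.apache.solr.core import SolrCore
```

Because existing entries in sys.modules are reused, a second extension only adds its own sub-packages under the shared prefix, which is the behaviour described above for a shared solr library.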
