[Zope-dev] "stemmed and stopped": problems with stopwords and the 'and' operator
OK, so the TextIndex of a ZCatalog says that it "stems and stops" the words before indexing them and, one would hope, before searching for them. I always thought that "stem" meant "derive the stem of the word" (so as to make the index smaller). I just peeked at the Splitter.c source code for the first time, and that sure ain't it. The American phrase would be "truncate and stop", I think. In any case, "stem" in the source code comments means truncate at MAX_WORD, which is 64 characters. That's an aside.

Now, about stopping. There is a list of "stop words" that don't get indexed. Fine. I'm having quite a bit of trouble figuring out exactly where this happens, but let's ignore that for now on the indexing side. It happens, and that's enough for now.

Now, what happens to stop words in an input search string? From single-stepping the code, I can see that stopwords are still in the query string while it is being parsed, and get looked up in the index. So, here is the heart of my problem: consider the search string

    someword and someotherword

Suppose 'someword' is a stopword. It doesn't get indexed because it is considered too common. Now, I would think that if this search string is submitted, the result would be to return the hits for 'someotherword'. Other people might, however, disagree. So, is the fact that TextIndex appears to return the null set in this case a bug or a feature?

I say 'appears' because I actually get 2 hits (out of about 2000 with the keyword 'car') in my database when I search on 'car and the'. I tried to single-step through the logic using the debugger, but when the call is made to the splitter with the stopword passed in, python core dumps. I can do 'from SearchIndex.Splitter import Splitter', call Splitter, and see that stopwords are not removed, but I can't do 'from SearchIndex.UnTextIndex import Splitter' because it complains about not being able to import Persistent from Persistence.
(*That* problem was reported by someone else in another context not too long ago.)

However, it's pretty clear that this null set return is what is happening, since when the evaluate subroutine is entered, the stop word is in the partially parsed string, and is in fact passed to the Splitter in the __getitem__ of the text index. If the splitter stopped it, the returned result set would be None. If the splitter doesn't stop it, the text index is still going to return a null set as the result for that word, since by definition it doesn't appear in the index. An 'and' of any result set with None is going to be the null set.

So it looks like the thing was designed this way: the stop words get "deleted" from the search string by not being in the index and by therefore returning null sets when looked up. This works fine for 'or' logic, but not for 'and' logic, IMO.

Contrary opinions? Helpful hints? If I'm right and this needs fixing, it's going to be a bit of a bear to do, I think.

(Where those two hits are coming from is a *real* mystery, but one I'm going to ignore for a little while yet since I can't yet get the debugger to work for me without crashing. I have a sneaking suspicion it is related to my confusion about where stopwords get removed in the indexing process, but it will probably take a while for me to prove or disprove that notion.)

--RDM

___
Zope-Dev maillist - [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )
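The behaviour argued for in the message above (dropping stopwords from an 'and' query rather than letting them empty the result) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TextIndex code: the `STOP_WORDS` set, the dictionary-based `index` and the `and_query` function are all hypothetical names invented for the sketch.

```python
# Hypothetical sketch: NOT the actual ZCatalog/TextIndex implementation.
STOP_WORDS = {"the", "and", "a", "of"}  # assumed stopword list


def and_query(index, words):
    """Intersect the hit sets of all non-stopword terms.

    `index` maps a word to the set of document ids containing it.
    Stopwords are skipped instead of contributing an empty set.
    """
    terms = [w.lower() for w in words if w.lower() not in STOP_WORDS]
    if not terms:
        return set()  # the query consisted only of stopwords
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


index = {"car": {1, 2, 3}, "red": {2, 3, 4}}
hits = and_query(index, ["car", "the"])  # the stopword 'the' is ignored
```

With this approach a stopword behaves as if it had never been typed, so 'car and the' yields the same hits as 'car' alone. Whether a query made up only of stopwords should return nothing or everything is a separate design decision; the sketch returns nothing.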
Re: [Zope-dev] ZClass properties and DTML
Hi,

> Carl Robitaille writes:
> > Is there a way to
> > include DTML tags in properties? If not, what are my other options?
> Not that I know of.
>
> Make your ZClass folderish (derive from Folder) and
> use DTML methods instead of properties.

Thanks for the tip. It's not what I had in mind, but it's working now. Isn't that what's important ;-)

Carl
Re: [Zope-dev] ZClass properties and DTML
Carl Robitaille writes:
> Is there a way to
> include DTML tags in properties? If not, what are my other options?

Not that I know of.

Make your ZClass folderish (derive from Folder) and use DTML methods instead of properties.

Dieter
Re: [Zope-dev] Caching problems
That property is set with

    self._p_changed = 1

right after you change anything in that list.

jens

on 8/16/00 9:37, R. David Murray at [EMAIL PROTECTED] wrote:

> On Wed, 16 Aug 2000, Bob Pepin wrote:
>> attribute of the class. When I append something to that list, it stays
>> there at first, but only until I restart Zope. It disappears (==is set
>
> You have to let Zope know that the object has been modified so
> it knows to commit it to disk.
>
>     x = self.list
>     x.append('something')
>     self.list = x
>
> will do that. Hmm. Actually I suppose that
>
>     self.list.append('something')
>     self.list = self.list
>
> would also work. (The point is for __setattr__ to get called so
> Zope can notice that the object has changed).
>
> There's also a property you can set on self to notify zope of the
> modification, but I forget its name.
Re: [Zope-dev] Caching problems
On Wed, 16 Aug 2000, Bob Pepin wrote:
> attribute of the class. When I append something to that list, it stays
> there at first, but only until I restart Zope. It disappears (==is set

You have to let Zope know that the object has been modified so it knows to commit it to disk.

    x = self.list
    x.append('something')
    self.list = x

will do that. Hmm. Actually I suppose that

    self.list.append('something')
    self.list = self.list

would also work. (The point is for __setattr__ to get called so Zope can notice that the object has changed.)

There's also a property you can set on self to notify Zope of the modification, but I forget its name.

--RDM
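Why the reassignment matters can be shown without Zope at all. The sketch below is a plain-Python simulation, not ZODB code: `Tracked` is a hypothetical stand-in for a persistent base class, and its `changed` flag merely imitates the bookkeeping that the message above attributes to `__setattr__`.

```python
# Stand-in for a persistent base class; NOT the real ZODB Persistent.
class Tracked:
    def __init__(self):
        # Bypass our own __setattr__ while setting up the flag.
        object.__setattr__(self, "changed", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        # Any attribute assignment marks the object as modified.
        object.__setattr__(self, "changed", True)


class Share(Tracked):
    def __init__(self):
        Tracked.__init__(self)
        self.list = []
        # Pretend the freshly initialised state has been committed.
        object.__setattr__(self, "changed", False)


share = Share()

share.list.append("something")    # in-place mutation: __setattr__ never runs
seen_after_append = share.changed

share.list = share.list           # reassignment: __setattr__ runs
seen_after_reassign = share.changed
```

The append only touches the list object, so the owning object has no way of noticing it; the assignment goes through the owner's `__setattr__` and is therefore seen.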
Re: [Zope-dev] Caching problems
You are directly changing an object that does not participate in persistence. As stated in
http://www.python.org/workshops/2000-01/proceedings/papers/fulton/zodb3.html :

    All sub-objects of persistent objects must be persistent or immutable. This rule is necessary because, without it, the persistence system would not be notified of persistent object state changes. Like most rules, this rule can be broken with care, as is done in the issue tracking system. A persistent object can use mutable non-persistent sub-objects if it notifies the persistence system that the sub-object has changed. It can do this in two ways. It can notify the persistence system directly by assigning a true value to the attribute _p_changed

So what I think you need is to tell the object which has the list property that it has changed, because it can't know that by itself if you call list.append or list.extend, or even assign list[0] = something. To do this you can:

1 - set self._p_changed = 1 on the object after the change (append or something)
2 - assign self.list to itself so that the object knows a change has been made to that property.

But mainly... RTFM ;-)

On Wed, 16 Aug 2000, Bob Pepin wrote:
>
> Hi,
> I have a problem with a class I wrote where I have a list as an
> attribute of the class. When I append something to that list, it stays
> there at first, but only until I restart Zope. It disappears (==is set
> to the value I assigned to it in __init__) and reappears as well when
> I hit reload a few times very quickly. Whenever I flush the cache it
> disappears immediately. There seems to be no transaction registered by
> Zope, because it doesn't show up under 'Undo'. I observed this both
> thru a dtml page and a debugging function written in python.
>
> I attached the code below, the method and attributes I'm talking about
> are IEEShare.read_access_roles, IEEShare.write_access_roles and
> IEEShare.add_user_access()
>
> The problem exists with both Zope 2.2.0 and 2.2.1b1. I'm running
> 2.2.1b1 right now on a SuSE Linux 6.4 default installation. (standard libc,
> threads etc.)
> Both versions of Zope are compiled from source.
>

--
"Sometimes I think the surest sign that intelligent life exists elsewhere in the Universe is that none of it has tried to contact us."
Carlos Neves [EMAIL PROTECTED]
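The quoted rule ("persistent or immutable") also suggests a third option besides the two listed above: keep the sub-object immutable, so that every change is necessarily an assignment. The following is a rough sketch of that idea in plain Python, not ZODB code; `Tracked`, `AccessList`, `add_role` and the `dirty` flag are all hypothetical names, with `dirty` standing in for whatever the persistence system records.

```python
# Stand-in for a persistent base class; NOT the real ZODB Persistent.
class Tracked:
    def __init__(self):
        object.__setattr__(self, "dirty", False)

    def __setattr__(self, name, value):
        object.__setattr__(self, name, value)
        object.__setattr__(self, "dirty", True)  # assignment is noticed


class AccessList(Tracked):
    def __init__(self):
        Tracked.__init__(self)
        self.roles = ()  # immutable tuple instead of a mutable list
        # Pretend the initial state has been committed.
        object.__setattr__(self, "dirty", False)

    def add_role(self, role):
        # A tuple cannot be changed in place, so we build a new one and
        # assign it; the assignment is what the owner gets to see.
        self.roles = self.roles + (role,)


acl = AccessList()
acl.add_role("Read Access")
```

The cost is a copy on every change, which is usually acceptable for short lists such as role names.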
[Zope-dev] Funny bug in Zope?
I'm not quite sure if this has already been reported or not, so I won't give much information on it (I've looked for something similar, but I haven't found it).

I'm running Zope 2.2.0 under Linux. I have this product I've made myself, called TravelAgent. I add an instance of it to the Zope root folder, which I call TravelAgent. Everything works fine. If I rename this instance to something that has fewer than five characters in it, something very fishy starts to happen. I should also mention that in the index_html method's body tag I have a background attribute which displays somepic.gif.

Now, if the instance is called, for example, Trav (i.e. fewer than five characters) and I try to render the index_html, it displays the background image correctly, but it also pukes out the gif in ASCII on top, and nothing else. This happens very consistently, unless I reload fast four or five times; then it displays the whole index_html just fine. I rename it back to Trave (i.e. five characters or more) and everything is fine.

I can reproduce this all the time. Strange, huh?
[Zope-dev] Caching problems
Hi,

I have a problem with a class I wrote where I have a list as an attribute of the class. When I append something to that list, it stays there at first, but only until I restart Zope. It disappears (==is set to the value I assigned to it in __init__) and reappears as well when I hit reload a few times very quickly. Whenever I flush the cache it disappears immediately. There seems to be no transaction registered by Zope, because it doesn't show up under 'Undo'. I observed this both thru a dtml page and a debugging function written in python.

I attached the code below, the method and attributes I'm talking about are IEEShare.read_access_roles, IEEShare.write_access_roles and IEEShare.add_user_access()

The problem exists with both Zope 2.2.0 and 2.2.1b1. I'm running 2.2.1b1 right now on a SuSE Linux 6.4 default installation. (standard libc, threads etc.) Both versions of Zope are compiled from source.

__doc__ = """IEEFolder product module."""
__version__ = '0.1'

import string
from Globals import HTMLFile,MessageDialog,Persistent
import OFS.Folder
import OFS.PropertyManager
import Acquisition
import AccessControl
from AccessControl import getSecurityManager

manage_addIEEFolderForm = HTMLFile('ieefolderAdd', globals())

def manage_addIEEFolder(self, id, title=None, REQUEST=None):
    """Add an IEE Folder to a folder."""
    ob=IEEFolder()
    ob.id=str(id)
    if title:
        ob.title=title
    self._setObject(id, ob)
    if REQUEST is not None:
        return self.manage_main(self, REQUEST, update_menu=1)

def findProperty(ids, props, searchterm, path='', all=0):
    """Find a property """
    result=[]
    checkPermission=getSecurityManager().checkPermission
    for obj in ids:
        if hasattr(obj, '_properties') and checkPermission('Access contents information', obj):
            for md in getattr(obj, '_properties'):
                propid=md['id']
                if (all or (propid in props)) and \
                   (string.find(str(getattr(obj, propid)),searchterm) != -1):
                    result.append({'object': obj, 'id': path + obj.id, 'url': obj.absolute_url()})
        if getattr(obj, 'isPrincipiaFolderish', None):
            result.extend(findProperty(obj.objectValues(), props, searchterm, \
                                       path=path+obj.id+'.', all=all))
    return result

class IEEFolder(OFS.Folder.Folder, Persistent, Acquisition.Implicit,
                AccessControl.Role.RoleManager,
                OFS.PropertyManager.PropertyManager
                ):
    meta_type = 'IEE Folder'

    __ac_permissions__=(
        ('Read Access', ('manage_findPropertyForm', 'manage_findProperty',
                         'index_html', 'manage_main', 'manage_workspace',
                         'objectIds', 'objectValues', 'objectItems', '')),
        ('Write Access', ('manage_delObjects',)))

    manage_workspace__roles__=('Read Access','Write Access')

    manage_options = (
        {'label': 'Folder View', 'action': 'index_html', 'image': 'folder-view'},
        {'label': 'Search', 'action': 'manage_findPropertyForm', 'image': 'search'},
        {'label': 'Undo', 'action': 'manage_UndoForm', 'image': 'undo'})

    index_html = HTMLFile('index', globals())
    manage_main = HTMLFile('index', globals())
    manage_findPropertyForm=HTMLFile('findProperty', globals())
    findPropertyResult=HTMLFile('findPropertyResult', globals())
    manage_UndoForm=HTMLFile('undo', globals())

    def filtered_objectIds(self):
        map(lambda x: x.id, filter(lambda x: getSecurityManager().checkPermission('Read Access', x), self.objectValues()))

    def manage_findProperty(self, searchterm, props=[], allprops='all'):
        """Find a property."""
        if type(props) is type(''):
            props=[props]
        if allprops == 'all':
            allprops = 1
        else:
            allprops = 0
        return self.findPropertyResult(self, result=findProperty(self.objectValues(), props, searchterm, all=allprops), URL=self.absolute_url())


__doc__ = """IEEShare product module."""
__version__ = '0.1'

import nis,traceback
from Globals import HTMLFile,MessageDialog,Persistent
from Products.CARS.IEEFolder import IEEFolder
from Products.CARS.NisLogin import NisLogin
from Products.LoginManager.LoginManager import manage_addLoginManager
from Globals import HTMLFile

manage_addIEEShareForm = HTMLFile('ieeshareAdd', globals())

def manage_addIEEShare(self, id, title=None, REQUEST=None):
    """Add an IEE Share to a folder."""
    ob=IEEShare()
    ob.id=str(id)
    ob.title=title
    self._setObject(id, ob)
    ob=self._getOb(id)
    ob.manage_role('Read Access', permissions=('Read Access',))
    ob.manage_role('Write Access', permissions=('Write Access',))
    #manage_addLoginManager(ob, usource='NIS User Source')
    if REQUEST is not None:
        return self.manage_main(self,
Re: [Zope-dev] hmmm.. wierd permission issues with getPersistentItemIDs()...
Hi Steve,

Thanks for the reply. Of course as soon as I reported this, I went away for a couple of days and I haven't been able to check the list.

It appears that the problem is that the BTreeItems object returned by getPersistentItemIDs isn't currently allowed as an argument of 'in' by itself, since it's not in the 'containerAssertions' dictionary defined in SimpleObjectPolicies.py and it doesn't have the magic property '__allow_access_to_unprotected_subobjects__'. If you *sort* the BTreeItems object, however, the dtml-in tag makes a copy of the items in the BTreeItems object as a simple list, and sorts that rather than destructively attempting to sort the original. The simple list is in containerAssertions, and is therefore allowed.

I was wrong about the it's only that seems to cause the problem. The odd thing is that the method 'getPersistentItemIDs' is correctly included in the definition of __ac_permissions__ in Rack.py, but as you point out, it returns a BTreeItems object that doesn't want to play nice with .

One possible solution would be to add an '__allow_access_to_unprotected_subobjects__' property to the BTreeItems object. I'm not sure who should do that. Maybe Rack.py?

For now, I'll just sort the ids. ;-)

thanks,
-steve

> "Steve" == Steve Alexander <[EMAIL PROTECTED]> writes:

Steve> Steve Spicklemire wrote:
>> Hi ZPatterns folks...
>>
>> ZPatterns-0.4.1snap1 Zope2.2.0-src
>>
>> I have a specialist with a defaultRack storing DataSkin
>> subclassed ZClass instances with only persistent attribute
>> providers.
>>

Steve> When I call that, I get . To get that list of IDs, I use
Steve> an external method:

Steve> def get_persistent_ids(self):
Steve>     try:
Steve>         items = self.defaultRack.aq_base.getPersistentItemIDs()
Steve>         return map(lambda x: x, items)
Steve>     except:
Steve>         import sys, traceback, string
Steve>         etype, val, tb = sys.exc_info()
Steve>         sys.stderr.write(string.join(traceback.format_exception(etype, val, tb),''))
Steve>         del etype, val, tb

Steve> I've tried something like your code, with no sheetproviders
Steve> in the rack. I can't reproduce your error. I'm using the
Steve> method as a Manager.

>> or
>>
>> ...
>>
>> raise AuthorizationFailed
>>
>> ...
>>
>>
>> works fine. What did I do now? ;-)

Steve> Line 318, Rack.py. The method getPersistentItemIDs has no
Steve> docstring. Is that still significant under the new security
Steve> model?

Steve> Does the user you're running the method as have the
Steve> permission "Access contents information" ?

Steve> Looks like you may have uncovered a Zope security bug in
Steve> :-/

Steve> How could we test this further?

Steve> -- Steve Alexander Software Engineer Cat-Box limited
Steve> http://www.cat-box.net
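Both workarounds in this thread (sorting in dtml-in, and the `map(lambda x: x, items)` call in the external method) come down to the same thing: copying a lazy items object into a plain list before handing it on. Below is a minimal sketch of that copy step in present-day Python. `LazyItems` is a hypothetical stand-in written for this example; it is not the BTreeItems class and says nothing about how Zope's security checks treat either type.

```python
class LazyItems:
    """Hypothetical stand-in for a lazy, index-only sequence."""

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        # Raises IndexError past the end, which ends iteration.
        return self._data[i]


def get_ids_as_list(items):
    """Copy any sequence-like object into a plain list."""
    return [item for item in items]


ids = get_ids_as_list(LazyItems(["a", "b", "c"]))
```

The result is an ordinary list, the type that the message above says is already covered by containerAssertions.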
[Zope-dev] Python class and ZClass
Hi,

I would like to subclass a Python class with a ZClass. I mean that my DTML document will declare a ZClass, and I would like to be able to access the methods declared in the Python class (the parent of the ZClass). Does anybody know how to do that?

PS: I read a paper called "SubClassing from Custom Python classes", but I did not understand everything, so if somebody can give me an example or the main steps to solve my problem...

Thank you very very much

Vincent