RE: Simple XML and Office extractor configs
Hi Warwick,

The MS office extractor will extract the OLE metadata, like author and date modified, from office documents. The example directory /slide/doc is the directory you want it to extract properties from. That directory doesn't exist. It is just an example. Once extracted, the metadata is stored as regular DAV properties. The overhead of using /slide/files is probably too much for people who don't need it.

I'm not sure about the xml extractor, but I was getting the same exception on init. I think it is a bug because I don't remember getting it a couple of months ago. I just disabled it.

-Ryan

From: Warwick Burrows [mailto:[EMAIL PROTECTED]]
Sent: Monday, August 30, 2004 4:45 PM
To: '[EMAIL PROTECTED]'
Subject: Simple XML and Office extractor configs

Hi,

The simple and office extractors are configured in the Domain.xml file by default. What do they do? One is configured to /slide/articles/test.xml and the other to /slide/doc. Do they create these dirs and put data in them, or do they only extract properties from files under those directories? Do I need to have a test.xml to make it work? I'm getting an xpath failure from the simple extractor init process and am not sure how they should be configured. Slide starts, but I'm guessing that the extractors (or at least one) didn't load.

Thanks,
Warwick

Warwick Burrows
Senior Software Engineer
Email: [EMAIL PROTECTED]
Fax: 512.343.8727
9600 Great Hills Trail, #325
Austin, TX 78759
http://www.e2open.com/
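As an illustration of the scoping described above, where an extractor only applies to resources under its configured directory, here is a minimal Python sketch. It is not Slide's actual matching code: the function name in_scope and the path-prefix rule are assumptions made purely for illustration.

```python
def in_scope(resource_uri: str, extractor_uri: str) -> bool:
    """Return True if resource_uri is the configured URI or lies beneath it.

    Hypothetical helper: assumes scoping is a plain path-prefix check,
    which this thread does not confirm for Slide.
    """
    base = extractor_uri.rstrip("/")
    # Compare on a path-segment boundary so /slide/docs does not match /slide/doc.
    return resource_uri == base or resource_uri.startswith(base + "/")


if __name__ == "__main__":
    print(in_scope("/slide/doc/report.doc", "/slide/doc"))
    print(in_scope("/slide/docs/report.doc", "/slide/doc"))
```

Under this assumption, a file uploaded to /slide/doc/report.doc would be handed to the office extractor, while one under /slide/docs would not.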
RE: Simple XML and Office extractor configs
Thanks. So if a new document is uploaded under /slide/doc, the office extractor will extract properties from it, but not otherwise. Will it also process any files that are already there when it starts, or would you need to upload them again for it to process their properties? I would like it enabled for my whole tree, but I'm not sure what you mean when you say it may be too much for most people. Are there performance problems with the implementation?

Thanks,
Warwick

-----Original Message-----
From: Ryan Rhodes [mailto:[EMAIL PROTECTED]]
Sent: Monday, August 30, 2004 7:46 PM
To: 'Slide Users Mailing List'
Subject: RE: Simple XML and Office extractor configs

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
RE: Simple XML and Office extractor configs
Actually, I don't know anything about the performance. I only meant that it is turned off by default and that the entry is an example config. I've been using the MS content extractors I wrote only in a test environment, and I haven't had performance problems yet. The OLE metadata is the fastest thing you can extract, but I'm not sure whether the metadata extractor gets called much more often than the content extractors.

The content extractor definitely only saves the content when the document is originally saved or updated. I am guessing the property extractors do the same thing. This is actually a big problem I have with it right now. It would be easy to write a recursive touch tool, but the modification dates are meaningful in the application and I don't want them changed. Any ideas here would be great!

-Ryan

-----Original Message-----
From: Warwick Burrows [mailto:[EMAIL PROTECTED]]
Sent: Monday, August 30, 2004 9:12 PM
To: 'Slide Users Mailing List'
Subject: RE: Simple XML and Office extractor configs
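The "recursive touch" idea raised in the message above can be sketched on a plain local filesystem, as an analogy only. The sketch rewrites each file in place (the step that would trigger a save) and then puts the original timestamps back. The function name resave_preserving_mtime is hypothetical, and nothing here uses Slide or WebDAV: whether a Slide store lets a client reset a resource's modification date after a re-save is not established in this thread.

```python
import os
from pathlib import Path


def resave_preserving_mtime(root: str) -> int:
    """Rewrite every file under root in place, then restore its timestamps.

    Filesystem analogy only (assumption): in Slide the re-save would be a
    WebDAV upload, and restoring the date may not be possible there.
    """
    count = 0
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        stat = path.stat()                # remember the original times
        data = path.read_bytes()
        path.write_bytes(data)            # the "touch": same bytes, new mtime
        # Put the original access and modification times back.
        os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns))
        count += 1
    return count
```

Reading the timestamps before the rewrite matters: once the file has been saved again, the original modification date is gone unless it was recorded first.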