Re: SearchBlox J2EE Search Component Version 1.1 released
On Tuesday 02 December 2003 09:51, Tun Lin wrote:

Anyone knows a search engine that supports xml formats?

There's no way to support XML formats in general, as XML is just a meta-language. However, when building a specific search engine on the Lucene core, it should be reasonably straightforward to implement more accurate, XML-structure-aware tokenization for specific XML applications such as DocBook or other domain-specific vocabularies. So, if any search engine advertises indexing XML content, one had better read the fine print to learn what it really claims.

It might be interesting to create a Lucene plug-in that, given a specification of how to handle the subtrees under specific elements, would tokenize and index content into separate fields. Implementation shouldn't be very difficult either: just use a standard XML parser (SAX, DOM), match XPaths against the document, feed the matched content to an analyzer, and add the result to the index. This could also be used for HTML (pre-filtering with JTidy or similar first to get XML-compliant HTML).

I wouldn't be surprised if someone on the list has already done this.

-+ Tatu +-

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
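For illustration, the XPath-to-field idea from the message above could be sketched roughly as follows. This is only a minimal sketch using the JDK's own DOM and XPath classes; the class name XPathFieldExtractor, the method extractFields and the field-name-to-XPath map are hypothetical names made up for this example, not part of Lucene or of any existing plug-in. It stops at producing field name/text pairs; handing those to an analyzer and adding them to a Lucene index is deliberately left out.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: maps XPath expressions to named index fields. */
class XPathFieldExtractor {

    /**
     * Parses the XML string and returns, for each field name in the
     * specification, the text of all nodes matched by that field's XPath,
     * joined with single spaces. Fields with no match map to "".
     */
    static Map<String, String> extractFields(String xml, Map<String, String> fieldToXPath) {
        try {
            // Standard JDK DOM parser; a SAX-based variant would use less memory.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            Map<String, String> fields = new LinkedHashMap<>();
            for (Map.Entry<String, String> entry : fieldToXPath.entrySet()) {
                NodeList nodes =
                        (NodeList) xpath.evaluate(entry.getValue(), doc, XPathConstants.NODESET);
                StringBuilder text = new StringBuilder();
                for (int i = 0; i < nodes.getLength(); i++) {
                    if (text.length() > 0) {
                        text.append(' ');
                    }
                    text.append(nodes.item(i).getTextContent().trim());
                }
                // In a real indexer this text would go to an analyzer and
                // be stored as one field of the indexed document.
                fields.put(entry.getKey(), text.toString());
            }
            return fields;
        } catch (Exception e) {
            // Wrap parser and XPath failures in a single unchecked exception.
            throw new IllegalArgumentException("Could not extract fields from XML", e);
        }
    }
}
```

With a specification such as title -> /article/title and body -> //para, each XML document yields one text value per field, which is the shape an indexer needs. For HTML input, the same method could be applied after a tidy-up step has produced well-formed XML.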
RE: SearchBlox J2EE Search Component Version 1.1 released
If you buy it, apparently: http://www.searchblox.com/buy.html

-Original Message-
From: Tun Lin [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 10:43 AM
To: 'Lucene Users List'; [EMAIL PROTECTED]
Subject: RE: SearchBlox J2EE Search Component Version 1.1 released

Hi, Just a feedback. SearchBlox can only search for html files. Will Searchblox support pdf, xml and word documents in future? It will be perfect if it can support all document types mentioned above.
RE: SearchBlox J2EE Search Component Version 1.1 released
Hi, Does it support xml?

-Original Message-
From: Tate Avery [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 11:45 PM
To: Lucene Users List
Subject: RE: SearchBlox J2EE Search Component Version 1.1 released

If you buy it, apparently: http://www.searchblox.com/buy.html
Re: SearchBlox J2EE Search Component Version 1.1 released
No. The formats supported by SearchBlox are given here: http://www.searchblox.com/faqs/question.php?qstId=5

Tun Lin wrote:

Hi, Does it support xml?
SearchBlox J2EE Search Component Version 1.1 released
SearchBlox is a J2EE search component that enables you to add search functionality to your applications, intranets or portals in a matter of minutes. SearchBlox uses Lucene Search API and features integrated HTTP and File System crawlers, support for different document formats, support for indexing and searching content in 15 languages and customizable search results, all controlled from a browser-based Admin Console.

Main features in this update:

- Asian language support. SearchBlox now supports Japanese, Chinese Simplified, Chinese Traditional and Korean language content.
- Performance enhancements to search
- Improved Hit Highlighting

SearchBlox is available as a Web Archive (WAR) and is deployable on any Servlet 2.3/JSP 1.2 compliant server.

SearchBlox Getting-Started Guides are available for the following servers:

JBoss - http://www.searchblox.com/gettingstarted_jboss.html
Jetty - http://www.searchblox.com/gettingstarted_jetty.html
JRun - http://www.searchblox.com/gettingstarted_jrun.html
Pramati - http://www.searchblox.com/gettingstarted_pramati.html
Resin - http://www.searchblox.com/gettingstarted_resin.html
Tomcat - http://www.searchblox.com/gettingstarted_tomcat.html
Weblogic - http://www.searchblox.com/gettingstarted_weblogic.html
Websphere - http://www.searchblox.com/gettingstarted_websphere.html

The SearchBlox FREE Edition is available free of charge and can index up to 1000 HTML documents. The software can be downloaded from http://www.searchblox.com
RE: SearchBlox J2EE Search Component Version 1.1 released
Hi, Just a feedback. SearchBlox can only search for html files. Will Searchblox support pdf, xml and word documents in future? It will be perfect if it can support all document types mentioned above.