Re: SearchBlox J2EE Search Component Version 1.1 released

2003-12-03 Thread Tatu Saloranta
On Tuesday 02 December 2003 09:51, Tun Lin wrote:
> Anyone knows a search engine that supports xml formats?

There's no way to support XML formats in general, as XML is just a 
meta-language. However, when building a specific search engine on the Lucene 
core, it should be reasonably straightforward to implement more accurate, 
XML-structure-aware tokenization for specific XML applications like DocBook 
or other domain-specific apps.
So, if any search engine advertises indexing XML content, one had better read 
the fine print to learn what it really claims.

It might be interesting to create a Lucene plug-in that, given a specification 
of how the subtrees under specific elements should be handled, would tokenize 
and index their content into separate fields. The implementation shouldn't be 
very difficult: use a standard XML parser (SAX, DOM), match XPaths against the 
document, feed the matching content to an analyzer, and add the result to the 
index. This could also be used for HTML (pre-filtering with JTidy or similar 
first to get XML-compliant HTML).
I wouldn't be surprised if someone on the list has already done this.
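
A minimal sketch of the idea above, under stated assumptions: the class name
XmlFieldExtractor, its extract method, and the path-to-field map are all
hypothetical, and simple absolute element paths such as /article/title stand
in for real XPath matching. It uses only the JDK's SAX parser and stops at
collecting text per field. Handing each field's text to a Lucene analyzer and
adding it to the index is left out, since that part depends on the Lucene
version in use.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: gathers the text under configured element paths into named fields. */
public class XmlFieldExtractor {

    /**
     * @param xml         well-formed XML text
     * @param pathToField maps a simple absolute path (e.g. "/article/title") to a field name
     * @return field name mapped to the whitespace-normalized text found under that path
     */
    public static Map<String, String> extract(String xml, final Map<String, String> pathToField) {
        final Map<String, StringBuilder> buffers = new LinkedHashMap<>();
        final Deque<String> open = new ArrayDeque<>(); // open elements, innermost at the head

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                open.push(qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                StringBuilder buf = currentBuffer();
                if (buf != null) buf.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Separate the text of adjacent elements so tokens do not run together.
                StringBuilder buf = currentBuffer();
                if (buf != null) buf.append(' ');
                open.pop();
            }

            /** Buffer of the deepest mapped ancestor-or-self path, or null if none is mapped. */
            private StringBuilder currentBuffer() {
                StringBuilder path = new StringBuilder();
                String field = null;
                for (Iterator<String> it = open.descendingIterator(); it.hasNext();) {
                    path.append('/').append(it.next()); // walk from the root element inward
                    String mapped = pathToField.get(path.toString());
                    if (mapped != null) field = mapped;
                }
                return field == null ? null : buffers.computeIfAbsent(field, k -> new StringBuilder());
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }

        // Each entry here is what would be analyzed and indexed as its own field.
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : buffers.entrySet()) {
            fields.put(e.getKey(), e.getValue().toString().replaceAll("\\s+", " ").trim());
        }
        return fields;
    }
}
```

For example, mapping /article/title to "title" and /article/body to "body"
gives one text value per field name. Text inside nested elements such as
<para> or <b> lands in the field of the nearest mapped ancestor.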

-+ Tatu +-



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: SearchBlox J2EE Search Component Version 1.1 released

2003-12-02 Thread Tate Avery

If you buy it, apparently:
http://www.searchblox.com/buy.html



-----Original Message-----
From: Tun Lin [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 10:43 AM
To: 'Lucene Users List'; [EMAIL PROTECTED]
Subject: RE: SearchBlox J2EE Search Component Version 1.1 released


Hi,

Just some feedback.

SearchBlox can only search HTML files. Will SearchBlox support PDF, XML and
Word documents in the future? It would be perfect if it could support all the
document types mentioned above.




RE: SearchBlox J2EE Search Component Version 1.1 released

2003-12-02 Thread Tun Lin
Hi,

Does it support XML?

-----Original Message-----
From: Tate Avery [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, December 02, 2003 11:45 PM
To: Lucene Users List
Subject: RE: SearchBlox J2EE Search Component Version 1.1 released


If you buy it, apparently:
http://www.searchblox.com/buy.html






Re: SearchBlox J2EE Search Component Version 1.1 released

2003-12-02 Thread Robert Selvaraj
No.

The formats supported by SearchBlox are listed here:

	http://www.searchblox.com/faqs/question.php?qstId=5

Tun Lin wrote:

> Hi,
>
> Does it support XML?



SearchBlox J2EE Search Component Version 1.1 released

2003-12-02 Thread Robert Selvaraj
SearchBlox is a J2EE search component that enables you to add search 
functionality to your applications, intranets or portals in a matter of 
minutes. SearchBlox uses the Lucene Search API and features integrated HTTP 
and file system crawlers, support for different document formats, 
support for indexing and searching content in 15 languages, and 
customizable search results, all controlled from a browser-based Admin 
Console.

Main features in this update:
=============================
- Asian language support. SearchBlox now supports Japanese, Chinese 
Simplified, Chinese Traditional and Korean language content.
- Performance enhancements to search
- Improved Hit Highlighting

SearchBlox is available as a Web Archive (WAR) and is deployable on any 
Servlet 2.3/JSP 1.2 compliant server. SearchBlox Getting-Started Guides 
are available for the following servers:

JBoss - http://www.searchblox.com/gettingstarted_jboss.html
Jetty - http://www.searchblox.com/gettingstarted_jetty.html
JRun - http://www.searchblox.com/gettingstarted_jrun.html
Pramati - http://www.searchblox.com/gettingstarted_pramati.html
Resin - http://www.searchblox.com/gettingstarted_resin.html
Tomcat - http://www.searchblox.com/gettingstarted_tomcat.html
Weblogic - http://www.searchblox.com/gettingstarted_weblogic.html
Websphere - http://www.searchblox.com/gettingstarted_websphere.html
	
The SearchBlox FREE Edition is available free of charge and can index up 
to 1000 HTML documents.

The software can be downloaded from http://www.searchblox.com





RE: SearchBlox J2EE Search Component Version 1.1 released

2003-12-02 Thread Tun Lin
Hi,

Just some feedback.

SearchBlox can only search HTML files. Will SearchBlox support PDF, XML and
Word documents in the future? It would be perfect if it could support all the
document types mentioned above.
