Re: Mapping Lucene search results with a relational database

2012-07-03 Thread Chris Lu
Can you index the rule1 and rule2 fields into the documents, and when searching with the keywords, also append rule1:foo and rule2:bar to the query? Chris - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbs
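
A minimal sketch of the suggestion above, in plain Java string handling rather than any Lucene API. The class and method names are made up for illustration, and the field values are assumed to need no query-parser escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RuleQueryBuilder {
    // Wraps the user's keywords in a required group and appends one required
    // field:value clause per rule, using Lucene query-parser syntax.
    // Assumption: keys and values contain no characters that need escaping.
    static String withRules(String keywords, Map<String, String> rules) {
        StringBuilder sb = new StringBuilder("+(").append(keywords).append(")");
        for (Map.Entry<String, String> e : rules.entrySet()) {
            sb.append(" +").append(e.getKey()).append(':').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("rule1", "foo");
        rules.put("rule2", "bar");
        // prints: +(lucene database) +rule1:foo +rule2:bar
        System.out.println(withRules("lucene database", rules));
    }
}
```

The resulting string would then be handed to whatever query parser the application already uses.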

Re: Bizarre Search order request

2012-05-25 Thread Chris Lu
Nothing like this yet. But you don't need to do everything in one search request. You can send one search request to learn the match distribution for each document type, and then send 3 requests, one for each of the 3 document types.

Re: Is there any "Query" in Lucene can search the term, which is similar as "SQL-LIKE"?

2011-10-11 Thread Chris Lu
You need to analyze the search keyword with the same analyzer that's applied to the "content" field.

Re: Distributing a Lucene application?

2011-03-24 Thread Chris Lu
p and running in several minutes. You can even embed a widget to put a search UI on any page. BTW, DBSight also has facet search.

Re: Distributing a Lucene application?

2011-03-22 Thread Chris Lu
does, this means the NY index could return results not found in the NY database, correct?

Re: Ui Framework for lucene

2010-12-07 Thread Chris Lu
You can use DBSight for database search. You just need to give it one or more SQL queries. You can also generate a search result template, and it will manage the index for you. http://www.dbsight.net

Re: High frequency term for the searched query

2010-11-04 Thread Chris Lu
After you get the query object, you can use IndexSearcher's docFreq() method, like this:

    // Collect the terms of the rewritten query, then look up each term's document frequency.
    final Set<Term> terms = new HashSet<Term>();
    query = searcher.rewrite(query);
    query.extractTerms(terms);
    for (Term t : terms) {
        int frequency = searcher.docFreq(t);
    }

Re: High frequency term for the searched query

2010-11-04 Thread Chris Lu
After you get the query object, you can use IndexSearcher's docFreq() method, like this:

    // Collect the terms of the rewritten query, then look up each term's document frequency.
    final Set<Term> terms = new HashSet<Term>();
    query = searcher.rewrite(query);
    query.extractTerms(terms);
    for (Term t : terms) {
        int frequency = irs.getSearcher().docFreq(t);
    }

Re: does lucene support Database full text search

2010-09-10 Thread Chris Lu
Lucene does not support databases directly. You need to pump the data into Lucene. You can use DBSight, which has a built-in high-performance crawler for any database. It also has integrated Chinese analyzers, including IKAnalyzer, which is the best one I have found so far.

Re: Federated search with opensearch or proprietary APIs for Atlassian

2010-09-02 Thread Chris Lu
more flexible with the structure, even dealing with data beyond Atlassian products. I guess that's the reason Google did not rely on each website's own search mechanism.

Re: Combine data from index and db before sorting and pagination

2010-09-01 Thread Chris Lu
"category_2", take doc5 and doc10 for example, after all the reindexing effort, the only change is: "category_1": doc1,doc2. "category_2": doc3,doc4,doc5,doc7,doc8,doc10. Of course, supporting this efficiently could be a big change, affecting all the nice

Re: Lucene applicability

2010-08-25 Thread Chris Lu
uld need a mechanism to get prepared and rebuild the index when you need to.

Re: Databases

2010-07-23 Thread Chris Lu
-time data import. Or you would have to put a hook in your program to write new content to the index. Anyway, you can get it to work, but it may not be as simple as you expected.

Re: Inserting data from multiple databases in same index

2010-07-22 Thread Chris Lu
several boxes and achieve sharded search.

Re: Will doc ids ever change if nothing is deleted?

2010-05-14 Thread Chris Lu
documents are added, the id is N+1. Of course, if some documents from other segments are merged, the documents in one segment will "lose" their doc ids.

Re: is there some dangerous bug in lucene?

2010-05-11 Thread Chris Lu
If you are using the field cache for field A and updating field A, isn't it normal that field A is not updated? The field cache is keyed by index reader; it wouldn't be efficient to reload the field cache for each updateDocument().

Re: Lucene Challenge - sum, count, avg, etc.

2010-04-01 Thread Chris Lu
of rows are not really "that" big when everything is properly warmed up.

Re: Lucene Challenge - sum, count, avg, etc.

2010-04-01 Thread Chris Lu
In DBSight, the aggregated values are computed at run time, and sorting on the computed aggregated values is done when displaying the results. The idea is that, after aggregation, the number of aggregated values is much, much smaller.
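
To illustrate the idea (this is not DBSight's actual code, which the message does not show), here is a plain-Java sketch that aggregates per facet value in one pass and then sorts only the small set of aggregates. The class, method, and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FacetAggregation {
    // Groups matching rows by facet value and accumulates count, sum and
    // average for each group in a single pass over the hits.
    static Map<String, DoubleSummaryStatistics> aggregate(String[] facets, double[] amounts) {
        Map<String, DoubleSummaryStatistics> stats = new TreeMap<>();
        for (int i = 0; i < facets.length; i++) {
            stats.computeIfAbsent(facets[i], k -> new DoubleSummaryStatistics()).accept(amounts[i]);
        }
        return stats;
    }

    // Sorts only the aggregated groups (usually few), largest sum first.
    static List<String> facetsBySumDesc(Map<String, DoubleSummaryStatistics> stats) {
        List<String> keys = new ArrayList<>(stats.keySet());
        keys.sort(Comparator.comparingDouble((String k) -> stats.get(k).getSum()).reversed());
        return keys;
    }

    public static void main(String[] args) {
        String[] facets = {"books", "music", "books", "film"};
        double[] amounts = {10, 5, 20, 7};
        Map<String, DoubleSummaryStatistics> stats = aggregate(facets, amounts);
        for (String facet : facetsBySumDesc(stats)) {
            DoubleSummaryStatistics s = stats.get(facet);
            System.out.println(facet + " count=" + s.getCount() + " sum=" + s.getSum() + " avg=" + s.getAverage());
        }
    }
}
```

Sorting happens after aggregation, so its cost depends on the number of distinct facet values, not on the number of matching rows.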

Re: Lucene Challenge - sum, count, avg, etc.

2010-04-01 Thread Chris Lu
ching.

Re: Lucene Challenge - sum, count, avg, etc.

2010-04-01 Thread Chris Lu
ly? It doesn't seem to be an opensource project so I can't really consider it. - Mike aka...@gmail.com On Thu, Apr 1, 2010 at 5:00 AM, Chris Lu wrote: Hi, Michel, This has already been implemented in DBSight. Check it out! http://www.dbsight.net You can get sum, avg for Facet

Re: Lucene Challenge - sum, count, avg, etc.

2010-04-01 Thread Chris Lu
Hi, Michel, This has already been implemented in DBSight. Check it out! http://www.dbsight.net You can get sum and avg for facet searches, and count is included in facet search directly.

Re: NAS vs SAN vs Server Disk RAID

2010-02-25 Thread Chris Lu
In my experience, some customers have used a SAN to store the index. It's pretty good and fast. This may be a good choice for you, but it's costly.

Re: If you could have one feature in Lucene...

2010-02-24 Thread Chris Lu
2 features: Search and serializable Query class in Java serializable object format, or XML, or JSON format.

Re: FastVectorHighlighter truncated queries

2010-02-23 Thread Chris Lu
This should be a common wildcard query highlighting problem. You will need to call query.rewrite() first and pass the result to the highlighter.

Re: Improving Zend lucene search - general guidance?

2010-02-19 Thread Chris Lu
PHP doesn't have Java-like "static" variables, right? They are "stateless". All the information, like term info, that's loaded into memory will be gone by the next search. You should use DBSight if you only have one week.

Re: Query about Query.ToString()

2010-02-18 Thread Chris Lu
otocol buffer? Hopefully I only need to serialize it via query.toXML() or query.toBytes() and the parser can recognize the serialized forms.

Re: Query about Query.ToString()

2010-02-17 Thread Chris Lu
XMLQueryParser is a pretty good start. However, has it been maintained recently? I noticed many Query classes are not supported, like PrefixQuery or even PhraseQuery. Is that for some particular reason, or simply a lack of resources?

Re: Scale Out

2010-02-08 Thread Chris Lu
Since you already have an RMI interface, maybe you can search several nodes in parallel, collect the data, pick the top ones, and send the results back via RMI.
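
A rough sketch of the scatter-gather step described above, in plain Java. Each node is modeled as a hypothetical Callable that returns its local hits; in a real setup that call would be the RMI request, which is not shown here. The Hit class and all names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ShardedSearch {
    static final class Hit {
        final String id;
        final float score;
        Hit(String id, float score) { this.id = id; this.score = score; }
    }

    // Runs every per-node search concurrently, then keeps the k best hits overall.
    static List<Hit> searchAll(List<Callable<List<Hit>>> nodes, int k) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, nodes.size()));
        try {
            // Min-heap on score: the weakest of the current top k sits on top.
            PriorityQueue<Hit> best = new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
            for (Future<List<Hit>> f : pool.invokeAll(nodes)) {
                for (Hit h : f.get()) {
                    best.add(h);
                    if (best.size() > k) best.poll(); // drop the current weakest
                }
            }
            List<Hit> top = new ArrayList<>(best);
            top.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return top;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("node search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<List<Hit>>> nodes = new ArrayList<>();
        nodes.add(() -> List.of(new Hit("a", 0.9f), new Hit("b", 0.2f)));
        nodes.add(() -> List.of(new Hit("c", 0.7f), new Hit("d", 0.95f)));
        for (Hit h : searchAll(nodes, 3)) System.out.println(h.id + " " + h.score);
    }
}
```

This only works if scores from different nodes are comparable, which is an assumption the sketch does not check.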

Re: index a mysql database -blob field

2010-01-29 Thread Chris Lu
For a BLOB, it is not so simple, since a BLOB could contain different file types, like HTML, PDF, Word, or zip files. So besides getting results out via the resultSet.getBlob() function, you will need to convert the binary stream into plain text strings. The DBSight free version can already read the blob

Re: Can't start Lucene App: java.io.FileNotFoundException with brand new directory

2010-01-24 Thread Chris Lu
Think of another approach: you can check whether the index exists with IndexReader.indexExists(), and then decide what you want to do with the IndexWriter constructor.

Re: Lucene as a primary datastore

2010-01-20 Thread Chris Lu
iliary data structure. It's only fast in one way, but could be slow in other ways. 3) The more robust approach is to pull data out of the database and create a Lucene index. In case something goes wrong, you can always pull the data out again and create the index again.

Re: IllegalArgumentException when IndexWriter.addDocument

2010-01-14 Thread Chris Lu
tokenstream, it is really never reset even across multiple documents). So whenever a stopword occurs it gets larger... - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Chris Lu [mailto:chris...@gmail.com

Re: IllegalArgumentException when IndexWriter.addDocument

2010-01-14 Thread Chris Lu
Note: I am using Lucene 3.0. Seems like an integer overflow problem?
java.lang.IllegalArgumentException: Increment must be zero or greater: -472893952
    at org.apache.lucene.analysis.tokenattributes.PositionIncrementAttributeImpl.setPositionIncrement(PositionIncrementAttributeImpl.java:58)
    at org

IllegalArgumentException when IndexWriter.addDocument

2010-01-14 Thread Chris Lu
Seems like an integer overflow problem?
java.lang.IllegalArgumentException: Increment must be zero or greater: -472893952
    at org.apache.lucene.analysis.tokenattributes.PositionIncrementAttributeImpl.setPositionIncrement(PositionIncrementAttributeImpl.java:58)
    at org.apache.lucene.analysis.StopFilt

Re: Switching from Store.YES to Store.NO

2010-01-05 Thread Chris Lu
Just curious, will it be adjusted during indexing when merging segments? Thanks!

Re: Lucene Analyzer that can handle C++ vs C#

2009-12-11 Thread Chris Lu
What we did in DBSight is provide a reserved list of words for every Lucene Analyzer. This way you can handle terms with special characters, like C++ and C#. Common analyzers are usually not suitable for these special words.
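
As a rough illustration of the reserved-word idea (not DBSight's implementation, and not a Lucene Analyzer), the following plain-Java tokenizer tries the reserved words first and falls back to ordinary alphanumeric tokens. The class name and the regex-based approach are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReservedWordTokenizer {
    private final Pattern pattern;

    // Builds one regex: each reserved word (quoted literally) is tried first,
    // then a plain run of letters and digits. If one reserved word is a
    // prefix of another, list the longer one first.
    ReservedWordTokenizer(List<String> reservedWords) {
        StringBuilder alternatives = new StringBuilder();
        for (String word : reservedWords) {
            alternatives.append(Pattern.quote(word.toLowerCase(Locale.ROOT))).append('|');
        }
        pattern = Pattern.compile(alternatives + "[a-z0-9]+");
    }

    // Lowercases the text and returns the tokens in order of appearance.
    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(text.toLowerCase(Locale.ROOT));
        while (m.find()) tokens.add(m.group());
        return tokens;
    }

    public static void main(String[] args) {
        ReservedWordTokenizer t = new ReservedWordTokenizer(Arrays.asList("C++", "C#"));
        // prints: [i, know, c++, c#, and, c]
        System.out.println(t.tokenize("I know C++, C# and C."));
    }
}
```

The same reserved list has to be applied at both index time and query time, otherwise "C++" in a query would still be reduced to "c".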

Re: Lucene Java 3.0.0 RC1 now available for testing

2009-11-17 Thread Chris Lu
So will I need to use 2 fields, one analyzed and the other binary, to replace what was previously one compressed field?

Re: How to use Lucene to suppot quick search on huge databases where the primary content is of non textual format ?

2009-11-09 Thread Chris Lu
e for quick data access. For your exact matching, it may not help much. Creating and maintaining a Lucene index surely takes more coding than creating a database index, though.

Re: Creating tag clouds with lucene

2009-11-05 Thread Chris Lu
eally have cared about the frequency in the search results? DBSight uses the multi-valued facet search approach to do tag clouds. Maybe I can "cheat" this way also... It does save some memory.

Re: Creating tag clouds with lucene

2009-11-05 Thread Chris Lu
Isn't the tag cloud just another facet search? The only difference is that the tag is multi-valued. Basically, just go through the search results and find all unique tag values.
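
A small plain-Java sketch of that loop, assuming each search hit has already been reduced to its list of tag values (how the tags are read from the index is left out). The names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TagCloud {
    // Each hit carries a multi-valued "tag" field; count every distinct tag
    // value across all hits. The counts can drive the font size in the cloud.
    static Map<String, Integer> countTags(List<List<String>> tagsPerHit) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> tags : tagsPerHit) {
            for (String tag : tags) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> hits = Arrays.asList(
                Arrays.asList("lucene", "search"),
                Arrays.asList("lucene", "java"),
                Collections.<String>emptyList());
        // prints: {java=1, lucene=2, search=1}
        System.out.println(countTags(hits));
    }
}
```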

Re: Facets

2009-11-03 Thread Chris Lu
.

Re: Merging Indexes

2009-10-26 Thread Chris Lu
Pretty sure you can delete the small indexes after the merge. BTW: How long do your indexing and merging take, respectively?

Re: Performance tips when creating a large index from database.

2009-10-22 Thread Chris Lu
, and you can adjust the number of threads for database queries and also for indexing to find your optimal data-pulling configuration.

Re: consistent ordering of multi-values in a field

2009-07-07 Thread Chris Lu
That's great and thanks for the super fast answer! Another question if not thread-hijacking: Will the ordering of fields be preserved also?

consistent ordering of multi-values in a field

2009-07-07 Thread Chris Lu
,value2}?

Re: dbsight

2009-05-10 Thread Chris Lu
little Lucene-specific configuration and some ready-to-use scaffolding. Compared to SOLR, DBSight's design is more like Ruby on Rails than a common web application.

Re: Can I run Lucene in google app engine?

2009-04-13 Thread Chris Lu
could be a good solution for a small index with per-user data. 3) For large, changing indexes, you need to find other solutions to maintain the Lucene index. My personal opinion is that finding $20/month VPS hosting is far easier than changing the way you code.

Re: Creating lucene index from databases

2009-03-31 Thread Chris Lu
kranthi, Maybe you should use DBSight Lite to get started and get familiar with Lucene features.

Re: i18n numbers

2009-03-26 Thread Chris Lu
Marcel, First of all, do you really want the user to search price:19.99? Maybe you should use some logic like price>=19.99? If so, you should use a range query to handle this case.

Re: Syncing lucene index with a database

2009-03-26 Thread Chris Lu
re not sure of the proper index structure yet. I think you can use the DBSight Free version to rapidly prototype and experiment with all these choices, without coding any XML, etc.

Re: Random sorting results

2009-03-21 Thread Chris Lu
Maybe you can adjust your ranking algorithm. For example, rank the most recent results higher?

Re: Optimum way to find all document without particular field

2009-03-04 Thread Chris Lu
Allahbaksh, If you ONLY want to find all documents with a particular field that is not null, you can loop through the TermEnum and TermDocs to find all the document ids. But this cannot easily be combined with other queries.

Re: Restricting the result set with hierarchical ACL

2009-03-02 Thread Chris Lu
belongs to, including the subgroups. Approach 2 should be more flexible. I don't think a user will have so many groups that they exceed the default of 1024.

Re: Merging database index with fulltext index

2009-02-28 Thread Chris Lu
y, ranking, etc. I think you'd better try it first. It's faster to install it, select the content with your SQL, and get the search up and running than to read the introductory materials.

Re: Merging database index with fulltext index

2009-02-28 Thread Chris Lu
Actually, you can use DBSight (disclaimer: I work on it) to collect the data and keep it in sync. The free version has most of the features and doesn't have a size limit.

Re: Merging database index with fulltext index

2009-02-28 Thread Chris Lu
I feel this may not be a good example, since you can easily index fields c, a, and d and let Lucene handle the filter "c = 'foo'" and the order-by clause "order by a desc, d"

Re: Optimal Solution for Unique Field Values

2009-02-15 Thread Chris Lu
I think you would need to 1) collect all the matching IDs for Field2=x; 2) loop through the terms of Field1 and, for each term's docs, collect the term if a doc is in the matching IDs from step 1. This should be the fastest approach, pretty similar to what you suggested.
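
The two steps can be sketched in plain Java as follows. Step 1 is assumed to have produced a BitSet of matching doc ids, and the Field1 postings are modeled as a simple map from term to doc ids; in Lucene these would come from the index (TermEnum/TermDocs in that era), which this sketch does not use.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class UniqueFieldValues {
    // matchingDocs: doc ids that matched Field2=x (step 1).
    // field1Postings: for each Field1 term, the doc ids that contain it.
    // Returns every Field1 term that occurs in at least one matching doc (step 2).
    static Set<String> uniqueValues(BitSet matchingDocs, Map<String, int[]> field1Postings) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, int[]> e : field1Postings.entrySet()) {
            for (int doc : e.getValue()) {
                if (matchingDocs.get(doc)) {
                    result.add(e.getKey());
                    break; // one hit is enough, move on to the next term
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet matching = new BitSet();
        matching.set(0);
        matching.set(2);
        Map<String, int[]> postings = new TreeMap<>();
        postings.put("red", new int[] {0, 1});
        postings.put("green", new int[] {1});
        postings.put("blue", new int[] {2, 3});
        // prints: [blue, red]
        System.out.println(uniqueValues(matching, postings));
    }
}
```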

Re: Multiple indexes vs single index

2009-02-14 Thread Chris Lu
eally, especially since your QPS is not so demanding.

Re: Faceted search with OpenBitSet/SortedVIntList

2009-02-07 Thread Chris Lu
To avoid creating a lot of objects and quickly throwing them away, you can adjust the Eden memory size, or you can create a bunch of objects and try to re-use them.

Re: indexing database

2009-01-21 Thread Chris Lu
This is not a Lucene question, but a JDBC question. The code is not releasing the JDBC connection, statement, and result set, and what's worse, it is creating new connections when paginating the results.

Re: Search Problem

2009-01-02 Thread Chris Lu
Basically, Lucene stores analyzed tokens and looks up matches based on those tokens. "Amin" after StandardAnalyzer is "amin", so you need to use new Term("body", "amin"), instead of new Term("body", "Amin"), to search.
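
To make the mismatch concrete, here is a toy inverted index in plain Java. Its analyze() method is only a stand-in for an analyzer (it lowercases and splits on non-ASCII-alphanumerics, which is far simpler than StandardAnalyzer), and all names are invented for the sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class TinyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stand-in for an analyzer: lowercase, then split on anything that is
    // not an ASCII letter or digit.
    static String[] analyze(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
    }

    // Indexes the analyzed tokens, not the original text.
    void add(int docId, String body) {
        for (String token : analyze(body)) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new HashSet<>()).add(docId);
            }
        }
    }

    // Exact lookup of an un-analyzed term, comparable to a TermQuery.
    Set<Integer> termLookup(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add(1, "Hello Amin");
        System.out.println(index.termLookup("Amin")); // prints: []
        System.out.println(index.termLookup("amin")); // prints: [1]
    }
}
```

Running the search text through the same analyze() step before the lookup gives the match, which is the point of the advice above.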

Re: Search Problem

2009-01-01 Thread Chris Lu
You need to let us know the analyzer you are using.

Re: duplication checking while indexing

2008-12-30 Thread Chris Lu
er to look it up. But it doesn't feel right.

Re: duplication checking while indexing

2008-12-29 Thread Chris Lu
Otis, thanks for the pointer. I think the question can be: how to access TermEnum or TermInfos during indexing? If this were possible, things would be easier.

duplication checking while indexing

2008-12-29 Thread Chris Lu
all contents are flushed to disk yet. Is it possible to query the not-yet-closed index?

Re: how to estimate how much memory is required to support the large index search

2008-11-17 Thread Chris Lu
So it looks like you are not really doing much sorting? The index divisor affects reader.terms(), but has little effect on sorting.

Re: how to estimate how much memory is required to support the large index search

2008-11-17 Thread Chris Lu
Calculation looks right. But what's the "Index divisor" that you mentioned?

Re: Can lucene search from multi-index directory like using FK in database?

2008-11-05 Thread Chris Lu
approach may not be that bad, although less performant. 2. You will need to create a little query parser to distribute words into two fields.

Re: Performance of never optimizing

2008-11-02 Thread Chris Lu
BTW: JIRA is great!

Re: Lucene vs. Database

2008-10-01 Thread Chris Lu
e of the search result, just use the database later on. But this isn't always correct. When you have 10 results per page, selecting the details from the database based on ids may not be that costly at all.

Re: Memory eaten up by String, Term and TermInfo?

2008-09-14 Thread Chris Lu
Can you try to update to the latest Lucene svn version, like yesterday's? LUCENE-1383 was checked in yesterday. This patch addresses a leak problem particular to J2EE applications.

Re: follow up of Lucene out of memory with RAMDirectory on J2EE environment

2008-09-13 Thread Chris Lu
ce for him, and thanks Michael McCandless for fixing the problem so quickly.

follow up of Lucene out of memory with RAMDirectory on J2EE environment

2008-09-10 Thread Chris Lu
a-dev@lucene.apache.org mailing list.

Re: memory leak during Lucene Search

2008-09-09 Thread Chris Lu
problem is fixed.

Re: Building Relationships between documents?

2008-09-09 Thread Chris Lu
If you want to do it in just one search, yes, you have to put the Entities' attributes into the documents. But you can search twice, the second time using values from the first search, say entity_id, to search the products.

memory leak during Lucene Search

2008-09-06 Thread Chris Lu
lieve this will affect disk-based indexes also.

Re: Lucene Memory Leak

2008-09-05 Thread Chris Lu
$CSIndexInput
|- input of org.apache.lucene.index.SegmentTermEnum
|- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
I am trying to track it down now. If anyone knows about it, please let me know.

Re: search for empty field?

2008-09-03 Thread Chris Lu
term there. Then all the unset bits are documents with empty fields. This should be fairly efficient.
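
A sketch of the bit-set idea in plain Java. The field's postings are modeled as a map from term to doc ids (in Lucene they would be read from the index), deleted documents are ignored, and the names are made up for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

class EmptyFieldFinder {
    // Sets a bit for every doc id that appears under any term of the field,
    // then flips the range [0, maxDoc): the bits left set are the docs that
    // have no value for the field.
    static BitSet docsWithoutField(int maxDoc, Map<String, int[]> fieldPostings) {
        BitSet hasValue = new BitSet(maxDoc);
        for (int[] docs : fieldPostings.values()) {
            for (int doc : docs) {
                hasValue.set(doc);
            }
        }
        hasValue.flip(0, maxDoc);
        return hasValue;
    }

    public static void main(String[] args) {
        Map<String, int[]> category = new HashMap<>();
        category.put("books", new int[] {0, 3});
        category.put("music", new int[] {1});
        // prints: {2, 4}
        System.out.println(docsWithoutField(5, category));
    }
}
```

The cost is one pass over the field's postings plus one bit per document, so it stays cheap even for large indexes.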

Re: search for empty field?

2008-09-03 Thread Chris Lu
Thanks Erick for reminding me of this! I only need to validate an index and make sure the content is correctly retrieved and the index doesn't have empty fields. So I'd better simply go through all documents by id and check them directly. Thanks!

search for empty field?

2008-09-02 Thread Chris Lu
Is it possible to query for documents that have empty values for a field? Say I need to find documents with an empty category. I tried a negative query: -category:* But it returns 0 documents. I think "category:*" basically matches all, so this "-category:*" doesn't wor

Re: Lucene Indexing DB records?

2008-08-22 Thread Chris Lu

Re: Results by unique id's

2008-08-12 Thread Chris Lu
Maybe re-organize the index structure as
doc1; 1; "car volvo", "car toyota"
doc2; 2; "car mitsubishi", "car skoda"
You can add the content field twice for the same company_id.

Re: Bug in Sun's 1.6 hotspot compiler that can cause index corruption

2008-07-30 Thread Chris Lu
Thanks!!! This would really save us a lot of effort!

Re: Using lucene as a database... good idea or bad idea?

2008-07-29 Thread Chris Lu
. The normal usage you listed sounds reasonable. But you may also need to think about maintenance. In case the index is corrupted somehow, you may also consider storing the data in a database, which is easier to manipulate manually.

Re: How to avoid duplicate records in lucene

2008-07-23 Thread Chris Lu
This way, you can control what your "unique key" is.

Re: too many clauses exception

2008-07-03 Thread Chris Lu
This is easy, use: BooleanQuery.setMaxClauseCount(4096);

Re: lucene query parser for double-worded term query

2008-06-24 Thread Chris Lu
Erick, Thanks! It's the analyzer problem. I should have used the same analyzer, KeywordAnalyzer, to create the query parser. Thanks a lot!

Re: lucene query parser for double-worded term query

2008-06-24 Thread Chris Lu
Yonik, Thanks for your quick reply! But I found that after backslash-escaping the space, both tags:San\ Francisco and tags:"San\ Francisco" turn into a PhraseQuery, just like tags:"San Francisco", and still no results are returned. Maybe the Lucene Query Parser does not handle this

lucene query parser for double-worded term query

2008-06-24 Thread Chris Lu
like new TermQuery(new Term("tags", "San Francisco")) But how can I achieve this via the Lucene Query Parser? If I use tags:"San Francisco", it's considered a phrase and turned into a term search for tags:San and tags:Francisco, which will not return results. Thank

Re: Simple Web Search

2008-06-16 Thread Chris Lu
Sounds like you should use DBSight. Besides a simple SQL crawler, you get ranking adjustable by time (freshness), efficient multi-valued facet search (tagging), etc.

Re: Concurrent query benchmarks

2008-06-10 Thread Chris Lu
Good work! I would like to see how it performs with several index reader instances, which is said to increase concurrency.

Re: slow FieldCacheImpl.createValue

2008-05-20 Thread Chris Lu
This should give a great boost to performance. Any plan to merge it into the main branch instead of keeping it as a patch?

Re: Boosting Search

2008-05-16 Thread Chris Lu
You may need some more data to really compare the performance. From previous experience, I would expect MySQL's search time to increase as the data grows, while Lucene's stays almost unchanged.

how to change the segment file names

2008-05-07 Thread Chris Lu
gment file names after it's built? Or, how to set a different starting counter for the segment file names? Thanks!

Re: How to make a query that associates 2 index files

2008-05-06 Thread Chris Lu
cient for particular query execution paths. If you have special requirements, you will have to re-structure your index for performance.

Re: how to cache multivalued field using fieldcache.

2008-04-18 Thread Chris Lu
No. FieldCache is only for single-valued fields. You would need to use your own data structure to cache multi-valued fields. Or leave the index on disk and use a solid-state disk for faster reads.

Re: Use of Lucene for DB Search

2008-04-10 Thread Chris Lu
Without changing your existing code, you can use the DBSight free version to create a Lucene index on your database data and provide search on it. It'll take you less time to get it going than reading all the manuals or marketing materials.

Re: Lucene vs. Database indexing (RE: Indexing and Searching from within a single Document)

2008-04-08 Thread Chris Lu
makes a lot of sense. The easiest way is to create a Lucene index, and apply range search on the index.

Re: payload performance wrt fieldcache

2008-04-03 Thread Chris Lu
If your index grows larger, the payload method will be slower, because payloads are read from the hard disk. FieldCache is in memory, which is much faster. Unless you are going with a solid-state disk, you'd better go with FieldCache for faster search.

Re: factor in stopwords when searching

2008-03-22 Thread Chris Lu
top. And this should be what companies like Yahoo/Google are already doing, I guess. Can someone confirm this?
