Jack,
What you say sounds hopeful, but it also sounds like quite a lot of work
to define/select the correct analyzer for each type of programming
language (we use SQL, PL/SQL, Java and C# mainly). Compared to what I
do now, which is just to throw all files at Glimpse and have it make
them searchable in a
Glimpse seems to use something similar to StandardAnalyzer, so I would give
it a try. For program code this should work quite well. To make the
auto-phrases work (which might be a good idea here, too), enable this feature
in the query parser (I am referring to the comment by Jack about
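For reference, the auto-phrase feature mentioned above is exposed on the classic query parser as setAutoGeneratePhraseQueries. A minimal sketch against the Lucene 4.1-era API (the field name "body" is made up for illustration):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

public class AutoPhraseDemo {

    // Parses user input with auto-generated phrase queries enabled, so a
    // single input token that the analyzer splits into several terms
    // (e.g. "foo-bar") becomes a phrase query instead of an OR of terms.
    static Query parse(String userInput) throws Exception {
        QueryParser parser = new QueryParser(Version.LUCENE_41, "body",
                new StandardAnalyzer(Version.LUCENE_41));
        parser.setAutoGeneratePhraseQueries(true);
        return parser.parse(userInput);
    }

    public static void main(String[] args) throws Exception {
        // StandardAnalyzer splits "foo-bar" into [foo, bar], so this prints
        // a phrase query on the "body" field.
        System.out.println(parse("foo-bar"));
    }
}
```

With the flag off, the same input would match documents containing either term anywhere; with it on, only the adjacent pair matches.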
Hi!
I wonder where one can get information about the current Lucene (v4.1) core search
classes - AtomicReader, CompositeReader, ReaderContexts - and how to use them
properly for building custom search algorithms.
Although Lucene in Action is really good, I can't find anything on these
classes
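(Lucene in Action covers the pre-4.0 API, which predates these classes.) In 4.x, DirectoryReader is a CompositeReader made up of one AtomicReader per index segment, and leaves() hands out the per-segment AtomicReaderContexts that custom search code typically iterates over. A hedged sketch, assuming an existing index at a made-up path:

```java
import java.io.File;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.FSDirectory;

public class ReaderContextDemo {

    // Sums maxDoc() over all segments; shows the usual leaf-by-leaf pattern.
    static int totalDocs(DirectoryReader reader) {
        int total = 0;
        for (AtomicReaderContext ctx : reader.leaves()) {
            AtomicReader leaf = ctx.reader();
            // ctx.docBase is the offset of this segment's doc IDs within
            // the composite reader's doc-ID space.
            total += leaf.maxDoc();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // "index" is a hypothetical path to an existing Lucene index.
        DirectoryReader reader = DirectoryReader.open(FSDirectory.open(new File("index")));
        try {
            System.out.println("docs=" + totalDocs(reader));
        } finally {
            reader.close();
        }
    }
}
```

Per-segment APIs such as leaf.terms(field) and leaf.termDocsEnum(...) hang off AtomicReader, which is why custom scoring code works one leaf at a time.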
Hello,
I want to implement a central index, and I heard about Lucene, so I would like
to ask for your help installing and configuring it. My OS is Windows 7/XP/Server
2008. If I could index just one database and make a search, I would be happy.
I would be grateful if you could send me any info about
You're probably better off using Solr, which is tightly linked with Lucene.
http://lucene.apache.org/solr/
I'm sure there are installation and getting started guides there.
--
Ian.
On Tue, Feb 5, 2013 at 12:58 PM, Álvaro Vargas Quezada
al...@outlook.com wrote:
Hello,
I want to implement a
Hi,
For the basics of Lucene, such as how to create a Lucene index and other
fundamentals, look into the Lucene in Action book.
On Tue, Feb 5, 2013 at 6:28 PM, Álvaro Vargas Quezada al...@outlook.com wrote:
Hello,
I want to implement a central index, and I heard about Lucene, so I would
like to ask your help
I am looking at the formats supported by the newer version of Tika (1.3) and was
not sure which version(s) of Microsoft Office it supports
(97/2000/2010/2013) for each of the below:
http://tika.apache.org/1.3/formats.html#Microsoft_Office_document_formats
Microsoft Word (also, does it support
Hey Guys,
I'm trying to figure out what would be a better approach to indexing when it
comes to a large number of records (say 1 billion).
As far as queries go:
1. Only support exact matches (a field is equal to some constant value) or
range matches (a field is larger/smaller than some constant
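Both of those query shapes are cheap in Lucene. A minimal sketch against the 4.1-era API (field names and values are made up; the range query assumes the field was indexed as a numeric LongField):

```java
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

public class ExactAndRangeDemo {

    // Combines an exact match and an inclusive numeric range match.
    static BooleanQuery buildQuery() {
        // Exact match: field equals a constant value.
        Query exact = new TermQuery(new Term("type", "login"));

        // Range match: numeric field between two constants, both inclusive.
        Query range = NumericRangeQuery.newLongRange("size", 100L, 5000L, true, true);

        BooleanQuery both = new BooleanQuery();
        both.add(exact, BooleanClause.Occur.MUST);
        both.add(range, BooleanClause.Occur.MUST);
        return both;
    }

    public static void main(String[] args) {
        System.out.println(buildQuery());
    }
}
```

TermQuery hits the inverted index directly, and NumericRangeQuery uses the trie-encoded terms that numeric fields write at index time, so neither requires scanning all documents.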
Part of the answer depends on what kind of records you have. For instance,
are you dealing with a lot of numeric data?
If you need all those functions and only want to support exact matches and
basic boolean comparisons, then I'd go with an RDBMS instead of Lucene.
You'll get better support for
The records are mostly logging events where they will have:
1. a timestamp
2. the type of the event
3. potentially a set of key/value properties
Then I would want to be able to slice and dice the records based on time
(required), type and/or the key/values.
In addition, I would want to have
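For a record shape like that, one common Lucene layout (a sketch against the 4.1-era API; all field and property names here are made up) is a numeric field for the timestamp so range queries work, an untokenized keyword field for the event type, and one keyword field per key/value property:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongField;
import org.apache.lucene.document.StringField;

import java.util.Map;

public class LogEventDemo {

    // Builds one Lucene document for a logging event.
    static Document buildEvent(long timestampMillis, String type,
                               Map<String, String> props) {
        Document doc = new Document();
        // 1. timestamp as a numeric field, enabling NumericRangeQuery slicing
        doc.add(new LongField("ts", timestampMillis, Field.Store.YES));
        // 2. event type as an untokenized keyword for exact matches
        doc.add(new StringField("type", type, Field.Store.YES));
        // 3. one keyword field per key/value property, under a "prop_" prefix
        for (Map.Entry<String, String> e : props.entrySet()) {
            doc.add(new StringField("prop_" + e.getKey(), e.getValue(),
                    Field.Store.YES));
        }
        return doc;
    }
}
```

Slicing by time then becomes a NumericRangeQuery on "ts", combined with TermQuery clauses on "type" and the "prop_*" fields in a BooleanQuery.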
Thanks for the input! Seems I should give this another chance using
the hints you all sent me. I'll report back my findings here.
/Mathias
On Mon, Feb 4, 2013 at 7:01 PM, Mathias Dahl mathias.d...@gmail.com wrote:
Hi,
I have hacked together a small web front end to the Glimpse text
indexing
So you should probably ask your question on the Elasticsearch mailing list.
I think some ES users already scale to x billion docs.
Even though ES is Lucene-based, it adds features to scale out (sharding,
routing...).
HTH
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
Here's another thought: if you desperately need complex searches, then
you could do heuristic filtering to narrow down the search: use an
analyzer that does some form of input splitting into terms (removing
excess whitespace, or even producing n-grams from the input), then do
the same for the