AM. I use just FSDir.
Best regards,
Artem.
SS> Suppose I want to index 500,000 documents (average document size is
SS> 4 kB). Let's assume I create a single index and that the index is
SS> static (I'm not going to add any new documents to it). I would guess
SS> the index wo
dSortFactory.create(sortFieldName, sortDescending) to get
Sort object for sorting query.
StoredFieldSortFactory source file can be extracted from LUCENE-769 patch or
from sharehound sources:
http://sharehound.cvs.sourceforge.net/*checkout*/sharehound/jNetCrawler/src/java/org/apache/lucene/search/StoredFieldSortFactory.java
Regard
n you please explain what sorted
NB> queries mean? Is simple keyword search a sorted query?
That's simple: if the results are presented on screen sorted by that keyword,
it's a sorted query :)
Another test is your system's code. The sorted queries I mean are calls to
IndexSearcher.search(q
uld be successfully
applied to multiple fields, but I'm not going to implement it yet. Maybe you?
:)
Regards,
Artem
IV> Hi Artem,
IV> Thank you very much for your mails :)
IV> So first I have to tell you that your patch works perfectly even with
IV> very big indexes - 40 GB (you c
, Integer.MIN_VALUE, Integer.MAX_VALUE,
true, true) for now) to find documents with field present?
2. Can one use DocValues effectively instead of Stored Fields to show found
documents? Or should I use UninvertingReader for fields that are not in
DocValues?
Thanks!
--
Artem Redkin
artemred
ourceforge.net/*checkout*/sharehound/jNetCrawler/src/java/org/sourceforge/sharehound/lucene/FilesSearchCommandImpl.java
Best regards,
Artem.
OG> Artem & Co.,
OG> Have you benchmarked this approach against the typical non-caching
OG> Sort? Both performance and memory benchmark?
O
string of maybe 50
symbols average). The machine looks quite beefy to me: an Intel Core Duo with
500M given to the application.
Regards,
Artem
On 4/23/07, Ivan Vasilev <[EMAIL PROTECTED]> wrote:
Hi All,
THANK YOU FOR YOUR HELP :)
I put this problem in the forum but I had no chance to
.
Regards,
Artem
On 4/24/07, Artem Vasiliev <[EMAIL PROTECTED]> wrote:
Hello Ivan!
It's so sad to me that you had bad results with that patch. :)
The discussion in the ticket is out-of-date - the patch was initially in
several classes, used WeakHashMap but then it evolved to what it
Hi Ivan!
btw, maybe forbidding the sorted search when there are too many results is an
option? I did it this way in my case.
Regards,
Artem.
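
A minimal sketch of the guard suggested above, assuming the match count is already known from an unsorted search. The class name, method name and threshold are hypothetical and are not part of Lucene or of the patch under discussion:

```java
// Hypothetical helper: decides whether a sorted search should be allowed.
final class SortGuard {
    // Assumed limit; tune it to the heap available to the application.
    static final int MAX_SORTED_HITS = 100_000;

    // Returns true when the result set is small enough to sort.
    static boolean allowSort(int matchCount, int maxSortedHits) {
        return matchCount <= maxSortedHits;
    }

    public static void main(String[] args) {
        int matchCount = 250_000; // e.g. taken from an unsorted first pass
        if (allowSort(matchCount, MAX_SORTED_HITS)) {
            System.out.println("run the sorted search");
        } else {
            System.out.println("too many results, fall back to unsorted search");
        }
    }
}
```

The caller would run the cheap unsorted query first and only request sorting when the guard allows it.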
On 4/24/07, Artem Vasiliev <[EMAIL PROTECTED]> wrote:
Ahhh, you said in your original post that your search matches _all_ the
results.. Yup my patch wi
now
search SMB file shares in LANs by their paths and names. It tracks
changes in directories so it even knows about deleted files. The
application is in alpha now but it's working, it has Web UI and RSS
subscription for query results (added today :), so I'll be glad if
it helps somebody h
now
XD> search SMB file shares in LANs by their paths and names. It tracks
XD> changes in directories so it even knows about deleted files. The
XD> application is in alpha now but it's working, it has Web UI and RSS
XD> subscription for query results (added today
ou sort on a field, a FieldCache entry is populated,
YS> enabling random access to that field value. A single int field for a
YS> 4M index == int[4M] == 16MB memory.
--
Best regards,
Artem mailto:[EMAIL PROTECTED]
---
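
The FieldCache estimate quoted above (one int per document for a 4M-document index, about 16MB) can be checked with a little arithmetic. This is only a sketch: the class and method names are made up for illustration, and JVM array header overhead is ignored.

```java
// Hypothetical helper for a back-of-the-envelope FieldCache size estimate.
final class FieldCacheEstimate {
    // Bytes needed to hold one int (4 bytes) per document.
    static long intCacheBytes(long numDocs) {
        return numDocs * Integer.BYTES;
    }

    public static void main(String[] args) {
        long bytes = intCacheBytes(4_000_000L);
        // Report in decimal megabytes, as in the quoted estimate.
        System.out.println(bytes / 1_000_000 + " MB");
    }
}
```

A String sort field costs more than this, since the cache also has to hold the field values themselves.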
,
CH> filename" .. this should reduce the size quite a bit if the number of
--
Best regards,
Artem
http://sharehound.sourceforge.net sharehound, the open source filesystems
indexer
create your own
YS> FieldCache that doesn't create/store that String[].
The int[] array here contains references into the String[], and to populate
it all the field values still need to be loaded and compared/sorted,
which is what I want to avoid. I guess my option is not to us
ly search resultsets it's almost as fast as the default
implementation. Note that this solution currently works only for
single-field sorting.
Best regards,
Artem
OG> This is unlikely to work well/fast. It will depend on the
OG> size of the index (not in terms of the number of docs
way
to restore the original docID.
--
Thanks in advance,
Artem.
siteReader...
Is there a better way?
Thanks,
Artem.
On Fri, Mar 21, 2014 at 6:33 PM, Oliver Christ wrote:
> Can you split your corpus across multiple Lucene instances?
>
> Cheers, Oli
>
> -----Original Message-----
> From: Artem Gayardo-Matrosov [mailto:ar...@gayardo.com]
>
and only solution to this
problem.
Artem.
On Fri, Mar 21, 2014 at 7:29 PM, Jack Krupansky wrote:
> Every word occurrence or every unique word? I mean Integer.MAX_VALUE like
> 2 billion. Even the OED only has 600,000 words defined. The former doesn't
> sound like a good use case m