Sent: Tuesday, June 26, 2012 8:46 AM
To: solr-user@lucene.apache.org
Subject: Re: solr java.lang.NullPointerException on select queries
Well, you'd have to understand the whole way the index structure is laid
out to do binary editing, and I don't know it well enough to even offer
a rough idea. There are detailed docs hanging around _somewhere_ that
will give you the formats, or you could go at the code. But that's probably
pretty h
So, I tried 'optimize', but it failed because of lack of space on the first
machine. I then moved the whole thing to a different machine where the index
was pretty much the only thing and was using about 37% of disk, but it still
failed because of a "No space left on device" IOException. Also, the
Right, if you optimize, at the end maxDocs should == numDocs.
Usually the document reclamation stuff is done when segments
merge, but that won't happen in this case since this index is
becoming static, so a manual optimize is probably indicated.
Something like this should also work, either way:
ht
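The maxDocs/numDocs relationship described above can be sketched numerically. The values here are hypothetical; only the arithmetic is the point:

```java
public class IndexStats {
    public static void main(String[] args) {
        // Hypothetical values read off admin/statistics:
        long numDocs = 2132454075L; // live (searchable) documents
        long maxDoc  = 2140000000L; // live + deleted-but-unreclaimed documents
        // The gap is deleted docs still occupying internal IDs:
        long deleted = maxDoc - numDocs;
        System.out.println(deleted); // 7545925
        // After a successful optimize, deleted space is reclaimed
        // and maxDoc == numDocs.
    }
}
```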
Erick, much thanks for detailing these options. I am currently trying the
second one as that seems a little easier and quicker to me.
I successfully deleted documents with IDs after the problem time, which I
know to within a couple of hours. Now, the stats are:
numDocs : 2132454075
maxDo
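A delete of everything above a cutoff ID, as described above, can be expressed as a Solr range-query delete. This sketch only builds the XML body (the unique-key field name `id` and the cutoff value are assumptions); you would POST it to the core's /update handler and follow with a commit:

```java
public class DeleteByRange {
    // Hypothetical unique-key field "id" and an illustrative cutoff.
    static String deleteXml(long firstBadId) {
        return "<delete><query>id:[" + firstBadId + " TO *]</query></delete>";
    }

    public static void main(String[] args) {
        System.out.println(deleteXml(2100000000L));
        // <delete><query>id:[2100000000 TO *]</query></delete>
    }
}
```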
Ah, OK, I misunderstood. Here are a couple of off-the-top-of-my-head
ideas.
Make a backup of your index before anything else ...
Split up your current index into two parts by segments. That is, copy the whole
directory to another place, and remove some of the segments from each. I.e.
when you'r
Erick, thanks for the advice, but let me make sure you haven't misunderstood
what I was asking.
I am not trying to split the huge existing index in install1 into shards. I
am also not trying to make the huge install1 index as one shard of a sharded
solr setup. I plan to use a sharded setup only fo
Don't even try to do that. First of all, you have to have a reliable way to
index the same docs to the same shards. The docs are all mixed up
in the segment files, and splitting them would lead to chaos. Solr/Lucene
will report the same doc multiple times if it's in different shards, so if you
ever updated a document, y
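One simple way to get the "same docs to the same shards" property mentioned above is to hash the document's unique key. This is a sketch, not Solr's own distribution logic; the helper name and shard count are made up:

```java
public class ShardRouter {
    // Hypothetical helper: pick a shard from the unique key so that
    // re-indexing the same ID always targets the same shard.
    static int shardFor(String uniqueKey, int numShards) {
        // floorMod keeps the result in [0, numShards) even for negative hashes.
        return Math.floorMod(uniqueKey.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int s1 = shardFor("doc-12345", 4);
        int s2 = shardFor("doc-12345", 4);
        System.out.println(s1 == s2); // true: routing is deterministic
    }
}
```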
Thanks. Do you know if the tons of index files with names like '_zxt.tis' in
the index/data/ directory have the lucene IDs embedded in the binaries? The
files look good to me and are partly readable even though they're binary. I am
wondering if I could just set up a new solr instance and move these index
files
That indeed sucks. But I don't personally know of a good way to
split apart an existing index into shards. I'm afraid you're
going to be stuck with re-indexing.
Wish I had a better solution.
Erick
On Wed, Jun 20, 2012 at 10:45 AM, avenka wrote:
> Yes, wonky indeed.
> numDocs : -2006905329
Yes, wonky indeed.
numDocs : -2006905329
maxDoc : -1993357870
And yes, I meant that the holes are in the database auto-increment ID space,
nothing to do with lucene IDs.
I will set up sharding. But is there any way to retrieve most of the current
index? Currently, all select queries even in
Let's make sure we're talking about the same thing. Solr happily
indexes and stores long (64-bit) values, no problem. What it doesn't
do is assign _internal_ document IDs as longs; those are ints.
On admin/statistics, look at maxDocs and numDocs. maxDocs + 1 will be the
next _internal_ lucene doc
Erick, thanks for pointing that out. I was going to say in my original post
that it is almost like some limit on max documents got violated all of a
sudden, but the rest of the symptoms didn't seem to quite match. But now
that I think about it, the problem probably happened at 2B (corresponding
exa
Internal Lucene document IDs are signed 32-bit numbers, so having
2.5B docs seems to be just _asking_ for trouble. That could
explain why this problem seemed to come out of thin air: if you kept adding
docs to the problem instance, you wouldn't have changed configs
etc., just added more docs
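The signed 32-bit ceiling mentioned above is easy to see directly: the counter tops out at Integer.MAX_VALUE (about 2.147B), and the next increment wraps negative.

```java
public class DocIdLimit {
    public static void main(String[] args) {
        // The largest possible internal doc ID:
        System.out.println(Integer.MAX_VALUE); // 2147483647
        // One more doc past the limit wraps the counter:
        System.out.println(Integer.MAX_VALUE + 1); // -2147483648
    }
}
```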
I re
For the first install, I copied over all files in the directory "example"
into, let's call it, "install1". I did the same for "install2". The two
installs run on different ports, use different jar files, are not really
related to each other in any way as far as I can see. In particular, they
are no
Can you tell us more about how you ran the second server? Is this an
independent installation, or is it the same installation as your first
server just starting with a different port?
Because this is very strange. It half feels like there are conflicting
jars somewhere in your path, but that's a guess