Chris, my response is below each of your paragraphs...


I don't have the means to try out this code right now ... but i can't see
any obvious problems with it (there may be somewhere that you are opening
a stream or reader and not closing it, but i didn't see one) ... i notice
you are running this client on the same machine as Solr (hence the
localhost URLs). Did you by any chance try running the client on a separate
machine to see if the number of updates before it hangs changes?


When I run the client locally and the Solr server on a slower and separate
development box, the maximum number of updates drops to 3,219. So it's
almost as if it's related to some sort of timeout problem because the
maximum number of updates drops considerably on a slower machine, but it's
weird how consistent the number is. 6,144 locally, 5,000 something when I
run it on the external server, and 3,219 when the client is separate from
the server.

my money is still on a filehandle resource limit somewhere ... if you are
running on a system that has "lsof" (on some Unix/Linux installations you
need sudo/su root permissions to run it) you can use "lsof -p ####" to
look up what files/network connections are open for a given process.  You
can try running that on both the client pid and the Solr server pid once
it's hung -- You'll probably see a lot of Jar files in use for both, but
if you see more than a few XML files open by the client, or more than
1 TCP connection open by either the client or the server, there's your
culprit.
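
If lsof isn't handy, the client JVM can also report its own open file descriptor count. This is only a rough sketch: the class name FdCount is made up for illustration, and it assumes a HotSpot-style JDK on Unix/Linux where the com.sun.management.UnixOperatingSystemMXBean interface is available. It returns -1 anywhere else (Windows, for example).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of the original client code.
final class FdCount {
    // Returns the number of file descriptors currently open in this JVM,
    // or -1 if the platform does not expose that figure.
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("open file descriptors: " + openFileDescriptors());
    }
}
```

Printing this every few hundred updates from inside the client would show whether the count keeps climbing before the hang.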


The only output I get from 'lsof -p' that pertains to TCP connections is
the following...I'm not too sure how to interpret it though:
java    4104 sangraal  261u  IPv6 0x5b060f0       0t0      TCP *:8009 (LISTEN)
java    4104 sangraal  262u  IPv6 0x55d59e8       0t0      TCP [::127.0.0.1]:8005 (LISTEN)
java    4104 sangraal  263u  IPv6 0x53cc0e0       0t0      TCP [::127.0.0.1]:http-alt->[::127.0.0.1]:51039 (ESTABLISHED)
java    4104 sangraal  264u  IPv6 0x5b059d0       0t0      TCP [::127.0.0.1]:51045->[::127.0.0.1]:http-alt (ESTABLISHED)
java    4104 sangraal  265u  IPv6 0x53cc9c8       0t0      TCP [::127.0.0.1]:http-alt->[::127.0.0.1]:51045 (ESTABLISHED)
java    4104 sangraal   11u  IPv6 0x5b04f20       0t0      TCP *:http-alt (LISTEN)
java    4104 sangraal   12u  IPv6 0x5b06d68       0t0      TCP localhost:51037->localhost:51036 (TIME_WAIT)

I'm not sure what Windows equivalent of lsof may exist.

Wait ... i just had another thought....

You are using InputStreamReader to deal with the InputStreams of your
remote XML files -- but you aren't specifying a charset, so it's using
your system default which may be different from the charset of the
original XML files you are pulling from the URL -- which (i *think*)
means that your InputStreamReader may in some cases fail to read all of
the bytes of the stream, which might leave some dangling filehandles (i'm
just guessing on that part ... i'm not actually sure what happens in that
case).
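
For what it's worth, here is a minimal sketch of reading a stream with an explicit charset and guaranteeing the reader gets closed. It assumes the remote XML is UTF-8 and a Java 7+ runtime (for try-with-resources); the class and method names are made up for illustration and are not from the original client.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original client code.
final class CharsetRead {
    // Reads the whole stream as text using the given charset. The
    // try-with-resources block closes the reader (and the underlying
    // stream) even if a read fails part way through.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the InputStream that would come from the remote URL.
        byte[] xml = "<add><doc><field name=\"id\">caf\u00e9</field></doc></add>"
                .getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(xml), StandardCharsets.UTF_8);
        System.out.println(text);
    }
}
```

If the source files declare a different encoding in their XML prolog, that charset would need to be passed instead of UTF-8.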

What if you simplify your code (for the purposes of testing) and just put
the post-transform version of ganja-full.xml in a big ass String variable
in your java app and just call GanjaUpdate.doUpdate(bigAssString) over and
over again ... does that cause the same problem?
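
Something along these lines is what that test could look like. It is only a sketch: the real signature of GanjaUpdate.doUpdate isn't shown in this thread, so the update call is passed in as a Consumer<String> (Java 8+), and the class name, payload, and iteration limit are placeholders.

```java
import java.util.function.Consumer;

// Hypothetical test harness, not part of the original client code.
final class RepeatUpdateTest {
    // Sends the same payload 'limit' times and returns how many calls
    // completed. Progress is printed every 1000 calls, so if the client
    // hangs, the last line printed shows roughly where it stalled.
    static int run(Consumer<String> update, String payload, int limit) {
        int done = 0;
        for (int i = 0; i < limit; i++) {
            update.accept(payload);
            done++;
            if (done % 1000 == 0) {
                System.out.println(done + " updates sent");
            }
        }
        return done;
    }

    public static void main(String[] args) {
        // Placeholder payload; paste the post-transform XML here.
        String bigString = "<add><doc><field name=\"id\">1</field></doc></add>";
        // Swap the empty lambda body for the real call, e.g.
        // xml -> GanjaUpdate.doUpdate(xml) (assumed to take a String).
        int completed = run(xml -> { }, bigString, 10000);
        System.out.println("completed " + completed + " updates");
    }
}
```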


In the code, I read the XML with a StringReader and then pass it to
GanjaUpdate as a string anyway.  I've output the String object and verified
that it is in fact all there.


-Sangraal
