How are you creating the Content objects for the
documents you're inserting?  You're not loading each
one into memory, are you?  If you instantiate the
Content object with a File or RandomAccessFile handle,
XCC will read the file from disk and pass it through
without needing to load the whole document into memory.
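
To illustrate the idea, here is a minimal sketch using only the JDK. It is not XCC's actual code; I'm only assuming XCC does something along these lines when it is handed a File. StreamingCopy and copy are made-up names for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of XCC: copies a file to an output stream
// through a fixed-size buffer, so only BUFFER_SIZE bytes are held at a time.
class StreamingCopy {
    static final int BUFFER_SIZE = 8192;

    // Returns the number of bytes passed through to the output stream.
    static long copy(Path source, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[BUFFER_SIZE];
        try (InputStream in = Files.newInputStream(source)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a document on disk; the stream stands in for the socket.
        Path tmp = Files.createTempFile("doc", ".xml");
        Files.write(tmp, new byte[100_000]);
        long sent = copy(tmp, new ByteArrayOutputStream());
        System.out.println("streamed " + sent + " bytes");
        Files.delete(tmp);
    }
}
```

With XCC itself, the equivalent is handing the File to ContentFactory.newContent(uri, file, options), as in the snippet quoted further down, rather than reading the bytes yourself.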

On Mar 16, 2010, at 8:54 PM, Lee, David wrote:

> FYI I added -XX:+UseConcMarkSweepGC and was able to load the 30k files 
> without error ...
> still way too slow (not sure why yet, but only about 2 docs/sec) ... but 
> at least it didn't fail.
> Will try again tonight to see if my success is reproducible but I have good 
> hopes this was the problem.
> 
> 
> 
> -----Original Message-----
> From: general-boun...@developer.marklogic.com 
> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Lee, David
> Sent: Monday, March 15, 2010 1:31 PM
> To: Michael Blakeley; General Mark Logic Developer Discussion
> Subject: [MarkLogic Dev General] RE: [MarkLogic Dev 
> General] ServerConnectionException-consistantly after about 20, 000 files
> 
> Thanks Mike, great idea about the GC.
> I have JProfiler so I can do good detailed measurements ... it just takes a 
> while because the errors don't start showing up for 2+ hours ... but a good 
> path to investigate!
> Thank you
> 
> -David
> 
> 
> -----Original Message-----
> From: Michael Blakeley [mailto:michael.blake...@marklogic.com]
> Sent: Monday, March 15, 2010 1:18 PM
> To: General Mark Logic Developer Discussion
> Cc: Lee, David
> Subject: Re: [MarkLogic Dev General]ServerConnectionException-consistantly 
> after about 20, 000 files
> 
> Lee, note that that's TIME_WAIT *not* TIMED_WAIT, and there's no need to 
> check the server, just the client. Any TIME_WAIT sockets will disappear 
> fairly quickly: the best way to check is during the test itself.
> 
> Have you thought about garbage collection? The fact that the error occurs 
> after a given number of inserts is suggestive. If the GC thread locks 
> everything else off the CPU for a long enough period of time, the server will 
> time out connections. This is especially likely to happen if the program 
> working set is a large percentage of the java heap size (which may in turn 
> indicate either leaks that you could fix, or a need for a larger heap).
> 
> You might try instrumenting your code to report insert times in Java, and 
> also report the elapsed time when you see exceptions. Then monitor the java 
> process size as your program runs. You may be able to correlate longer 
> elapsed times for inserts with GC events, which would tend to confirm this 
> hypothesis.
> 
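A rough sketch of that kind of instrumentation, using only the JDK. The names InsertTimer, timed, and totalGcMillis are hypothetical, and the Runnable stands in for whatever call performs one insert. It logs elapsed time, GC time accumulated during the call, and heap in use, including when the call throws.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical instrumentation helper; not part of XCC.
class InsertTimer {
    // Sum of accumulated collection time (ms) across all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Runs one operation, logs timing even on failure, returns elapsed ms.
    static long timed(String label, Runnable insert) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        String outcome = "ok";
        try {
            insert.run();
        } catch (RuntimeException e) {
            outcome = "FAILED (" + e + ")";
            throw e;
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            long gcMs = totalGcMillis() - gcBefore;
            Runtime rt = Runtime.getRuntime();
            long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
            System.err.printf("%s: %s, elapsed=%d ms, gc=%d ms, heap used=%d MB%n",
                    label, outcome, elapsedMs, gcMs, usedMb);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Stand-in workload; a real run would wrap the insert call here.
        timed("batch-1", () -> { byte[] junk = new byte[1 << 20]; });
    }
}
```

If the slow or failing inserts line up with large gc values or with heap use near the maximum, that would tend to support the GC hypothesis.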
> When using RecordLoader, XQSync, and Corb with large content sets, I 
> generally use the -XX:+UseConcMarkSweepGC VM option. Sometimes I also raise 
> the max heap size, but some care is required because too large a heap 
> seems to slow things down.
> 
> If GC and memory turn out to be involved, I also recommend looking at the 
> whole Java program carefully, with an eye toward minimizing memory 
> utilization and especially toward removing any object leaks. If you are 
> leaking objects, then memory will fill up sooner or later no matter what GC 
> does. There are some good Java profiling tools available to help with this.
> 
> -- Mike
> 
> On 2010-03-15 07:08, Lee, David wrote:
>> Thanks Ron, I'm doing all the things you suggest already
>> 1) Reusing a Session
>> 2) bundling 20 files in 1 insertContent()
>> 3) Checked netstat and there are no TIMED_WAIT connections on either 
>> client or server
>> 
>> I'm trying something different this time, which is to use a thread pool 
>> to try to increase the efficiency of sending the files.  Maybe this will 
>> be worse on the system, I don't know.
>> Maybe there is some kind of maximum session open time?
>> The errors occur about 2 hours into the transfer, typically.
>> I could try closing and reopening the session every hour ...
>> 
>> -David
>> Server: 4.1-4 on Fedora fc 11
>> Client: XP/Pro SP3 and Windows 7
>> XCC: Latest from download
>> 
>> 
>> -----Original Message-----
>> From: general-boun...@developer.marklogic.com
>> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Ron 
>> Hitchens
>> Sent: Monday, March 15, 2010 4:49 AM
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev
>> General]ServerConnectionException-consistantly after about 20, 000 
>> files
>> 
>> 
>>    You may be filling up your OS's file table.  When a socket is 
>> closed, the OS holds onto it for a while (default usually about two 
>> minutes) to reliably detect any straggler packets.
>> 
>>    When you cycle a lot of connections quickly, this can max out 
>> internal data structures in the OS.  If you do a netstat and see 
>> zillions of connections in TIME_WAIT state, that's probably what's 
>> happening.
>> 
>>    If you're connecting across a LAN, this delay is not really 
>> needed, because it's hard for packets to get rerouted anywhere else.
>> You can tune the socket wait time down to 5 seconds or so and that 
>> will allow file table slots to be re-used more quickly.
>> 
>>    You can also insert multiple documents per request, all of which 
>> will be transferred together and result in fewer low-level sockets 
>> being opened.
>> 
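The batching step can be sketched with plain JDK collections. Batcher and partition are made-up names; in a real loader each batch would presumably become one insertContent() call carrying an array of Content objects, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XCC: groups items so that each group
// can be sent in a single request.
class Batcher {
    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 45; i++) {
            files.add("doc" + i + ".xml");
        }
        for (List<String> batch : partition(files, 20)) {
            // One request per batch instead of one per document.
            System.out.println(batch.size() + " documents in one request");
        }
    }
}
```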
>> On Mar 14, 2010, at 9:28 PM, Lee, David wrote:
>> 
>>> FYI, here's a stack trace from the same program, but in this case it's
>>> the query component under load.
>>> This is very consistent as well, after about 10-20k requests.
>>> 
>>> 
>>> com.marklogic.xcc.exceptions.ServerConnectionException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
>>>  [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>>         at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
>>>         at com.marklogic.xcc.impl.SessionImpl.submitRequest(SessionImpl.java:280)
>>>         at org.xmlsh.marklogic.put.setChecksum(put.java:341)
>>>         at org.xmlsh.marklogic.put.flushContent(put.java:315)
>>>         at org.xmlsh.marklogic.put.putContent(put.java:288)
>>>         at org.xmlsh.marklogic.put.load(put.java:272)
>>>         at org.xmlsh.marklogic.put.load(put.java:266)
>>>         at org.xmlsh.marklogic.put.run(put.java:126)
>>>         at org.xmlsh.core.XCommand.run(XCommand.java:86)
>>>         at org.xmlsh.core.XCommand.run(XCommand.java:63)
>>>         at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>>         at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>>         at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>>         at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>>         at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
>>>         at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
>>>         at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
>>>         at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
>>> Caused by: java.io.IOException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
>>>         at com.marklogic.http.HttpHeaders.nextHeaderLine(HttpHeaders.java:326)
>>>         at com.marklogic.http.HttpHeaders.parseResponseHeaders(HttpHeaders.java:287)
>>>         at com.marklogic.http.HttpChannel.parseHeaders(HttpChannel.java:323)
>>>         at com.marklogic.http.HttpChannel.receiveMode(HttpChannel.java:293)
>>>         at com.marklogic.http.HttpChannel.getResponseCode(HttpChannel.java:187)
>>>         at com.marklogic.xcc.impl.handlers.EvalRequestController.issueRequest(EvalRequestController.java:111)
>>>         at com.marklogic.xcc.impl.handlers.EvalRequestController.serverDialog(EvalRequestController.java:62)
>>>         at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
>>>         ... 29 more
>>> 
>>> 
>>> 
>>> From: general-boun...@developer.marklogic.com
>>> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Lee, David
>>> Sent: Saturday, March 13, 2010 7:42 PM
>>> To: General Mark Logic Developer Discussion
>>> Subject: RE: [MarkLogic Dev General] ServerConnectionException-consistantly after about 20, 000 files
>>> 
>>> Here's a full stack trace, including my code in the stack.
>>> By "opening connections" I mean calling
>>> 
>>>      URI serverUri = new URI(connect);
>>>      ContentSource cs = ContentSourceFactory.newContentSource(serverUri);
>>> 
>>> for every file instead of reusing the ContentSource for all files.
>>> Although that may be a red herring ... when I do it that way (a new
>>> ContentSource for each file) I'm not aborting the push operation if
>>> one file fails, so I may be missing these errors in that case.
>>> 
>>> --------- Stack Trace
>>> 
>>> 
>>> 
>>> 2010-03-13 16:17:13,748 12310138 ERROR [main] core.SimpleCommand - Exception running command: ml:put
>>> com.marklogic.xcc.exceptions.ServerConnectionException: An established connection was aborted by the software in your host machine
>>>  [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>>         at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
>>>         at com.marklogic.xcc.impl.SessionImpl.insertContent(SessionImpl.java:204)
>>>         at org.xmlsh.marklogic.put.load(put.java:180)
>>>         at org.xmlsh.marklogic.put.load(put.java:171)
>>>         at org.xmlsh.marklogic.put.run(put.java:99)
>>>         at org.xmlsh.core.XCommand.run(XCommand.java:86)
>>>         at org.xmlsh.core.XCommand.run(XCommand.java:63)
>>>         at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>>         at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>>         at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>         at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>         at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
>>>         at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
>>>         at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
>>>         at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
>>> Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
>>>         at sun.nio.ch.SocketDispatcher.write0(Native Method)
>>>         at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>>>         at sun.nio.ch.IOUtil.write(IOUtil.java:60)
>>>         at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>>>         at com.marklogic.http.HttpChannel.writeBuffer(HttpChannel.java:373)
>>>         at com.marklogic.http.HttpChannel.writeBody(HttpChannel.java:353)
>>>         at com.marklogic.http.HttpChannel.flushRequest(HttpChannel.java:346)
>>>         at com.marklogic.http.HttpChannel.write(HttpChannel.java:134)
>>>         at com.marklogic.xcc.impl.handlers.ContentInsertController.writeChunkHeader(ContentInsertController.java:299)
>>>         at com.marklogic.xcc.impl.handlers.ContentInsertController.issueRequest(ContentInsertController.java:210)
>>>         at com.marklogic.xcc.impl.handlers.ContentInsertController.serverDialog(ContentInsertController.java:112)
>>>         at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
>>>         ... 20 more
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: general-boun...@developer.marklogic.com
>>> [mailto:general-boun...@developer.marklogic.com] On Behalf Of Sam Neth
>>> Sent: Saturday, March 13, 2010 6:08 PM
>>> To: General Mark Logic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] ServerConnectionException-consistantly after about 20, 000 files
>>> 
>>> Could you post a stack trace?
>>> 
>>> What version of XCC are you using?
>>> 
>>> What specifically are you referring to when you talk about "opening
>>> connections"?
>>> 
>>> On Mar 13, 2010, at 2:33 PM, Lee, David wrote:
>>> 
>>> 
>>> If I use XCC to iteratively insert a large set of documents, I
>>> consistently get this error:
>>> 
>>> com.marklogic.xcc.exceptions.ServerConnectionException: An established connection was aborted by the software in your host machine
>>>  [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>> 
>>> 
>>> This occurs after about 20,000 files and aborts the program.
>>> I'm thinking of implementing an exception handler to retry, but I don't
>>> want to be retrying after more serious errors.
>>> The server log doesn't show any problems, and this is on a dedicated
>>> 1GB wired LAN, so I don't think it's internet problems.
>>> 
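One way to retry only on connection-type failures, sketched with the JDK alone. Retry and withRetry are hypothetical names, and IOException is used here as a stand-in; with XCC the retryable type would presumably be ServerConnectionException. Any other exception propagates immediately, and the number of attempts is bounded.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of XCC.
class Retry {
    // Calls op, retrying only when the failure is an instance of retryable.
    // Other exceptions, and the last retryable failure, are rethrown as-is.
    static <T> T withRetry(Callable<T> op, Class<? extends Exception> retryable,
                           int maxAttempts, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                if (!retryable.isInstance(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Simple linear backoff before the next attempt.
                Thread.sleep(backoffMillis * attempt);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated insert that fails twice with a connection error, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("connection aborted");
            }
            return "inserted";
        }, IOException.class, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Whether a failed batch can safely be resent depends on the content being re-readable, so that is worth checking before relying on a retry like this.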
>>> If instead of using the same connection I open the connection for each
>>> file, it often gets around this problem, but not always.
>>> I think it's getting around it because I'm not aborting on error in
>>> that case (just going to the next file).
>>> 
>>> I'm using this code snippet to create the content in batches of 1-20
>>> (files in a directory):
>>> 
>>> Content content = ContentFactory.newContent(uri, file, mCreateOptions);
>>> contents.add(content);
>>> ...
>>> 
>>> if (!contents.isEmpty())
>>>      session.insertContent(contents.toArray(new Content[contents.size()]));
>>> 
>>> 
>>> 
>>> Any suggestions ?
>>> 
>>> 
>>> 
>>> ----------------------------------------
>>> David A. Lee
>>> Senior Principal Software Engineer
>>> Epocrates, Inc.
>>> d...@epocrates.com
>>> 812-482-5224
>>> 
>>> _______________________________________________
>>> General mailing list
>>> General@developer.marklogic.com
>>> http://xqzone.com/mailman/listinfo/general
>> 
>> ---
>> Ron Hitchens {mailto:r...@ronsoft.com}   Ronsoft Technologies
>>      (650) 766-2355 (Home Office)       http://www.ronsoft.com
>>      (707) 924-3878 (fax)               Bit Twiddling At Its Finest
>> "No amount of belief establishes any fact." -Unknown
>> 
> 

---
Ron Hitchens {mailto:r...@ronsoft.com}   Ronsoft Technologies
     (650) 766-2355 (Home Office)       http://www.ronsoft.com
     (707) 924-3878 (fax)               Bit Twiddling At Its Finest
"No amount of belief establishes any fact." -Unknown





