How are you creating the Content objects for the documents you're inserting? You're not loading each one into memory, are you? If you instantiate the Content object with a File or RandomAccessFile handle, XCC will read the file from disk and pass it through without needing to load the whole document into memory.
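To make the distinction concrete, here is a plain-Java sketch of what "pass it through" means. This is not XCC code and says nothing about how XCC is implemented internally; the names StreamingCopy, CountingSink and copy are made up for the illustration, and the sink merely stands in for a network connection. The point is only that a chunked copy holds one buffer's worth of the document at a time, however large the file is.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; none of these names come from XCC.
class StreamingCopy {

    // Stand-in for a network connection: counts bytes and discards them.
    static class CountingSink extends OutputStream {
        long count = 0;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            count += len;
        }
    }

    // Copies the stream in fixed-size chunks, so only bufferSize bytes
    // of the document are held in memory at any one time.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("doc", ".xml");
        try {
            // Dummy 1 MB document (built in memory only to create the test file).
            Files.write(file, new byte[1_000_000]);
            CountingSink sink = new CountingSink();
            try (InputStream in = Files.newInputStream(file)) {
                long sent = copy(in, sink, 8192);
                System.out.println("streamed " + sent + " bytes using an 8 KB buffer");
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The alternative, reading each file into a byte[] or String before handing it over, makes heap use grow with document size and batch size, which is the pattern to avoid if GC pressure is suspected.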

On Mar 16, 2010, at 8:54 PM, Lee, David wrote:

> FYI I added -XX:+UseConcMarkSweepGC and was able to load the 30k files without error ...
> still way too slow (not sure why yet, but only about 2 docs/sec) ... but at least it didn't fail.
> Will try again tonight to see if my success is reproducible, but I have good hopes this was the problem.
>
> -----Original Message-----
> From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Lee, David
> Sent: Monday, March 15, 2010 1:31 PM
> To: Michael Blakeley; General Mark Logic Developer Discussion
> Subject: [MarkLogic Dev General] RE: [MarkLogicDev General]ServerConnectionException-consistantly afterabout 20, 000 files
>
> Thanks Mike, great idea about the GC.
> I have JProfiler so I can do good detailed measurements ... just takes a while because the errors don't start showing up for 2+ hours ... but good path to investigate!
> Thank you
>
> -David
>
> -----Original Message-----
> From: Michael Blakeley [mailto:michael.blake...@marklogic.com]
> Sent: Monday, March 15, 2010 1:18 PM
> To: General Mark Logic Developer Discussion
> Cc: Lee, David
> Subject: Re: [MarkLogic Dev General]ServerConnectionException-consistantly after about 20, 000 files
>
> Lee, note that that's TIME_WAIT *not* TIMED_WAIT, and there's no need to check the server, just the client. Any TIME_WAIT sockets will disappear fairly quickly: the best way to check is during the test itself.
>
> Have you thought about garbage collection? The fact that the error occurs after a given number of inserts is suggestive. If the GC thread locks everything else off the CPU for a long enough period of time, the server will time out connections. This is especially likely to happen if the program working set is a large percentage of the java heap size (which may in turn indicate either leaks that you could fix, or a need for a larger heap).
>
> You might try instrumenting your code to report insert times in Java, and also report the elapsed time when you see exceptions. Then monitor the java process size as your program runs. You may be able to correlate longer elapsed times for inserts with GC events, which would tend to confirm this hypothesis.
>
> When using RecordLoader, XQSync, and Corb with large content sets, I generally use the -XX:+UseConcMarkSweepGC VM option. Sometimes I also raise the max heap size, but some care is required because too large a heap seems to slow things down.
>
> If GC and memory turn out to be involved, I also recommend looking at the whole Java program carefully, with an eye toward minimizing memory utilization and especially toward removing any object leaks. If you are leaking objects, then memory will fill up sooner or later no matter what GC does. There are some good Java profiling tools available to help with this.
>
> -- Mike
>
> On 2010-03-15 07:08, Lee, David wrote:
>> Thanks Ron, I'm doing all the things you suggest already
>> 1) Reusing a Session
>> 2) bundling 20 files in 1 insertContent()
>> 3) Checked netstat and there are no TIMED_WAIT connections on either client or server
>>
>> I'm trying something different this time, which is to use a thread pool to try to increase the efficiency of sending the files. Maybe this will be worse on the system, I don't know.
>> Maybe there is some kind of maximum session open time?
>> The errors occur about 2 hours into the transfer typically.
>> I could try closing and reopening the session every hour ...
>>
>> -David
>> Server: 4.1-4 on Fedora fc 11
>> Client: XP/Pro SP3 and Windows 7
>> XCC: Latest from download
>>
>> -----Original Message-----
>> From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Ron Hitchens
>> Sent: Monday, March 15, 2010 4:49 AM
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev General]ServerConnectionException-consistantly after about 20, 000 files
>>
>> You may be filling up your OS's file table. When a socket is closed, the OS holds onto it for a while (default usually about two minutes) to reliably detect any straggler packets.
>>
>> When you cycle a lot of connections quickly, this can max out internal data structures in the OS. If you do a netstat and see zillions of connections in TIME_WAIT state, that's probably what's happening.
>>
>> If you're connecting across a LAN, this delay is not really needed, because it's hard for packets to get rerouted anywhere else. You can tune the socket wait time down to 5 seconds or so and that will allow file table slots to be re-used more quickly.
>>
>> You can also insert multiple documents per request, all of which will be transferred together and result in fewer low-level sockets being opened.
>>
>> On Mar 14, 2010, at 9:28 PM, Lee, David wrote:
>>
>>> FYI, here's a stack trace from the same program, but in this case it's the query component under load.
>>> This is very consistent as well, after about 10-20k requests
>>>
>>> com.marklogic.xcc.exceptions.ServerConnectionException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
>>> [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
>>>     at com.marklogic.xcc.impl.SessionImpl.submitRequest(SessionImpl.java:280)
>>>     at org.xmlsh.marklogic.put.setChecksum(put.java:341)
>>>     at org.xmlsh.marklogic.put.flushContent(put.java:315)
>>>     at org.xmlsh.marklogic.put.putContent(put.java:288)
>>>     at org.xmlsh.marklogic.put.load(put.java:272)
>>>     at org.xmlsh.marklogic.put.load(put.java:266)
>>>     at org.xmlsh.marklogic.put.run(put.java:126)
>>>     at org.xmlsh.core.XCommand.run(XCommand.java:86)
>>>     at org.xmlsh.core.XCommand.run(XCommand.java:63)
>>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>>     at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>>     at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
>>>     at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
>>>     at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
>>>     at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
>>> Caused by: java.io.IOException: Error parsing HTTP headers: Premature EOF, partial header line read: ''
>>>     at com.marklogic.http.HttpHeaders.nextHeaderLine(HttpHeaders.java:326)
>>>     at com.marklogic.http.HttpHeaders.parseResponseHeaders(HttpHeaders.java:287)
>>>     at com.marklogic.http.HttpChannel.parseHeaders(HttpChannel.java:323)
>>>     at com.marklogic.http.HttpChannel.receiveMode(HttpChannel.java:293)
>>>     at com.marklogic.http.HttpChannel.getResponseCode(HttpChannel.java:187)
>>>     at com.marklogic.xcc.impl.handlers.EvalRequestController.issueRequest(EvalRequestController.java:111)
>>>     at com.marklogic.xcc.impl.handlers.EvalRequestController.serverDialog(EvalRequestController.java:62)
>>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
>>>     ... 29 more
>>>
>>> From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Lee, David
>>> Sent: Saturday, March 13, 2010 7:42 PM
>>> To: General Mark Logic Developer Discussion
>>> Subject: RE: [MarkLogic Dev General] ServerConnectionException-consistantly after about 20, 000 files
>>>
>>> Here's a full stack trace, including my code in the stack.
>>> By "opening connections" I mean calling
>>>
>>>     URI serverUri = new URI (connect);
>>>     ContentSource cs = ContentSourceFactory.newContentSource (serverUri);
>>>
>>> for every file instead of reusing the ContentSource for all files.
>>> Although that may be a red herring ... when I do it that way (new ContentSource for each file) I'm not aborting the push operation if one file fails, so I may be missing these errors in that case.
>>>
>>> --------- Stack Trace
>>>
>>> 2010-03-13 16:17:13,748 12310138 ERROR [main] core.SimpleCommand - Exception running command: ml:put
>>> com.marklogic.xcc.exceptions.ServerConnectionException: An established connection was aborted by the software in your host machine
>>> [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:99)
>>>     at com.marklogic.xcc.impl.SessionImpl.insertContent(SessionImpl.java:204)
>>>     at org.xmlsh.marklogic.put.load(put.java:180)
>>>     at org.xmlsh.marklogic.put.load(put.java:171)
>>>     at org.xmlsh.marklogic.put.run(put.java:99)
>>>     at org.xmlsh.core.XCommand.run(XCommand.java:86)
>>>     at org.xmlsh.core.XCommand.run(XCommand.java:63)
>>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.shell.Shell.runScript(Shell.java:362)
>>>     at org.xmlsh.core.ScriptCommand.run(ScriptCommand.java:75)
>>>     at org.xmlsh.sh.core.SimpleCommand.exec(SimpleCommand.java:121)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.core.Pipeline.exec(Pipeline.java:124)
>>>     at org.xmlsh.sh.shell.Shell.exec(Shell.java:560)
>>>     at org.xmlsh.sh.shell.Shell.interactive(Shell.java:461)
>>>     at org.xmlsh.commands.builtin.xmlsh.run(xmlsh.java:82)
>>>     at org.xmlsh.core.BuiltinCommand.run(BuiltinCommand.java:54)
>>>     at org.xmlsh.sh.shell.Shell.main(Shell.java:690)
>>> Caused by: java.io.IOException: An established connection was aborted by the software in your host machine
>>>     at sun.nio.ch.SocketDispatcher.write0(Native Method)
>>>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:33)
>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
>>>     at sun.nio.ch.IOUtil.write(IOUtil.java:60)
>>>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>>>     at com.marklogic.http.HttpChannel.writeBuffer(HttpChannel.java:373)
>>>     at com.marklogic.http.HttpChannel.writeBody(HttpChannel.java:353)
>>>     at com.marklogic.http.HttpChannel.flushRequest(HttpChannel.java:346)
>>>     at com.marklogic.http.HttpChannel.write(HttpChannel.java:134)
>>>     at com.marklogic.xcc.impl.handlers.ContentInsertController.writeChunkHeader(ContentInsertController.java:299)
>>>     at com.marklogic.xcc.impl.handlers.ContentInsertController.issueRequest(ContentInsertController.java:210)
>>>     at com.marklogic.xcc.impl.handlers.ContentInsertController.serverDialog(ContentInsertController.java:112)
>>>     at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:72)
>>>     ... 20 more
>>>
>>> From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Sam Neth
>>> Sent: Saturday, March 13, 2010 6:08 PM
>>> To: General Mark Logic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] ServerConnectionException -consistantly after about 20, 000 files
>>>
>>> Could you post a stack trace?
>>>
>>> What version of XCC are you using?
>>>
>>> What specifically are you referring to when you talk about "opening connections"?
>>>
>>> On Mar 13, 2010, at 2:33 PM, Lee, David wrote:
>>>
>>> If I use XCC to iteratively insert a large set of documents I consistently get this error
>>>
>>> com.marklogic.xcc.exceptions.ServerConnectionException: An established connection was aborted by the software in your host machine [Session: user=DLEE, cb={default} [ContentSource: user=DLEE, cb={none} [provider: address=home/192.168.1.10:8011, pool=0/64]]]
>>>
>>> This occurs after about 20,000 files and aborts the program.
>>> I'm thinking of implementing an exception handler to retry, but I don't want to be retrying after more serious errors.
>>> The server log doesn't show any problems, and this is on a dedicated 1GB wired LAN, so I don't think it's internet problems.
>>>
>>> If instead of using the same connection I open the connection for each file, it often gets around this problem, but not always.
>>> I think it's getting around it because I'm not aborting on error in that case (just going to the next file).
>>>
>>> I'm using this code snippet to create the content in batches of 1-20 (files in a directory)
>>>
>>>     Content content = ContentFactory.newContent (uri, file, mCreateOptions);
>>>     contents.add(content);
>>>     ...
>>>
>>>     if( ! contents.isEmpty() )
>>>         session.insertContent (contents.toArray(new Content[contents.size()]));
>>>
>>> Any suggestions?
>>>
>>> ----------------------------------------
>>> David A. Lee
>>> Senior Principal Software Engineer
>>> Epocrates, Inc.
>>> d...@epocrates.com
>>> 812-482-5224
>>>
>>> _______________________________________________
>>> General mailing list
>>> General@developer.marklogic.com
>>> http://xqzone.com/mailman/listinfo/general
>>
>> ---
>> Ron Hitchens {mailto:r...@ronsoft.com}   Ronsoft Technologies
>>      (650) 766-2355 (Home Office)      http://www.ronsoft.com
>>      (707) 924-3878 (fax)              Bit Twiddling At Its Finest
>> "No amount of belief establishes any fact." -Unknown

---
Ron Hitchens {mailto:r...@ronsoft.com}   Ronsoft Technologies
     (650) 766-2355 (Home Office)      http://www.ronsoft.com
     (707) 924-3878 (fax)              Bit Twiddling At Its Finest
"No amount of belief establishes any fact." -Unknown