Great, thanks for the update.

Karl
On Thu, Mar 8, 2018 at 10:47 AM, Mike Hugo <[email protected]> wrote:

> Thanks for the ideas and the sanity check! Based on your feedback we've
> been able to narrow down the problem to something in the custom output
> connector. Seems we need to join the thread at the end.
>
> On Thu, Mar 8, 2018 at 9:37 AM, Karl Wright <[email protected]> wrote:
>
>> As a sanity check, I ran the postgresql RSS connector IT test on trunk
>> and it passed:
>>
>> >>>>>>
>> run-IT-postgresql:
>> [junit] Testsuite: org.apache.manifoldcf.crawler.
>> connectors.rss.tests.RSSSimpleCrawlPostgresqlIT
>> [junit] Configuration file successfully read
>> [junit] [main] INFO org.eclipse.jetty.util.log - Logging initialized
>> @3336ms
>> [junit] [main] INFO org.eclipse.jetty.server.Server -
>> jetty-9.2.3.v20140905
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.w.WebAppContext@4d1c005e{/mcf-crawler-ui,file:/C:/User
>> s/kawright/AppData/Local/Temp/
>> jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-
>> 4871569714684839734.dir/webapp/,AVAILABLE}{C:\wip\mcf\
>> trunk\dist/web/war/mcf-crawler-ui.war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.w.WebAppContext@8462f31{/mcf-authority-service,file:/C
>> :/Users/kawright/AppData/Local
>> /Temp/jetty-0.0.0.0-8346-mcf-authority-service.war-_mcf-auth
>> ority-service-any-8765187688005999492.dir/webapp/,AVAILABLE}
>> {C:\wip\mcf\trunk\dist/web/war/mcf-authority-service
>> .war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.w.WebAppContext@24569dba{/mcf-api-service,file:/C:/Use
>> rs/kawright/AppData/Local/Temp
>> /jetty-0.0.0.0-8346-mcf-api-service.war-_mcf-api-service-any
>> -1263632524762735599.dir/webapp/,AVAILABLE}{C:\wip\mcf\trunk
>> \dist/web/war/mcf-api-service.war}
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Started ServerConnector@1e1ff947{HTTP/1.1}{0.0.0.0:8346}
>> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @6277ms
>> [junit] [main] INFO org.eclipse.jetty.server.Server -
>> jetty-9.2.3.v20140905
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.s.ServletContextHandler@7d286fb6{/rss,null,AVAILABLE}
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Started ServerConnector@3eb77ea8{HTTP/1.1}{0.0.0.0:8189}
>> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @6290ms
>> [junit] Crawl required 90542 milliseconds
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Stopped ServerConnector@3eb77ea8{HTTP/1.1}{0.0.0.0:8189}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.s.ServletContextHandler@7d286fb6{/rss,null,UNAVAILABLE}
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Stopped ServerConnector@1e1ff947{HTTP/1.1}{0.0.0.0:8346}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@24569dba{/mcf-api-service,file:/C:/Use
>> rs/kawright/AppData/Local/Temp
>> /jetty-0.0.0.0-8346-mcf-api-service.war-_mcf-api-service-any
>> -1263632524762735599.dir/webapp/,UNAVAILABLE}{C:\wip\mcf\
>> trunk\dist/web/war/mcf-api-service.war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@8462f31{/mcf-authority-service,file:/C
>> :/Users/kawright/AppData/Local
>> /Temp/jetty-0.0.0.0-8346-mcf-authority-service.war-_mcf-auth
>> ority-service-any-8765187688005999492.dir/webapp/,
>> UNAVAILABLE}{C:\wip\mcf\trunk\dist/web/war/mcf-authority-servi
>> ce.war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@4d1c005e{/mcf-crawler-ui,file:/C:/User
>> s/kawright/AppData/Local/Temp/
>> jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-
>> 4871569714684839734.dir/webapp/,UNAVAILABLE}{C:\wip\
>> mcf\trunk\dist/web/war/mcf-crawler-ui.war}
>> [junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time
>> elapsed: 126.5 sec
>> [junit]
>> [junit] ------------- Standard Error -----------------
>> [junit] Configuration file successfully read
>> [junit] [main] INFO org.eclipse.jetty.util.log - Logging initialized
>> @3336ms
>> [junit] [main] INFO org.eclipse.jetty.server.Server -
>> jetty-9.2.3.v20140905
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.w.WebAppContext@4d1c005e{/mcf-crawler-ui,file:/C:/User
>> s/kawright/AppData/Local/Temp/
>> jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-
>> 4871569714684839734.dir/webapp/,AVAILABLE}{C:\wip\mcf\
>> trunk\dist/web/war/mcf-crawler-ui.war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.w.WebAppContext@8462f31{/mcf-authority-service,file:/C
>> :/Users/kawright/AppData/Local
>> /Temp/jetty-0.0.0.0-8346-mcf-authority-service.war-_mcf-auth
>> ority-service-any-8765187688005999492.dir/webapp/,AVAILABLE}
>> {C:\wip\mcf\trunk\dist/web/war/mcf-authority-service
>> .war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.w.WebAppContext@24569dba{/mcf-api-service,file:/C:/Use
>> rs/kawright/AppData/Local/Temp
>> /jetty-0.0.0.0-8346-mcf-api-service.war-_mcf-api-service-any
>> -1263632524762735599.dir/webapp/,AVAILABLE}{C:\wip\mcf\trunk
>> \dist/web/war/mcf-api-service.war}
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Started ServerConnector@1e1ff947{HTTP/1.1}{0.0.0.0:8346}
>> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @6277ms
>> [junit] [main] INFO org.eclipse.jetty.server.Server -
>> jetty-9.2.3.v20140905
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Started o.e.j.s.ServletContextHandler@7d286fb6{/rss,null,AVAILABLE}
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Started ServerConnector@3eb77ea8{HTTP/1.1}{0.0.0.0:8189}
>> [junit] [main] INFO org.eclipse.jetty.server.Server - Started @6290ms
>> [junit] Crawl required 90542 milliseconds
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Stopped ServerConnector@3eb77ea8{HTTP/1.1}{0.0.0.0:8189}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.s.ServletContextHandler@7d286fb6{/rss,null,UNAVAILABLE}
>> [junit] [main] INFO org.eclipse.jetty.server.ServerConnector -
>> Stopped ServerConnector@1e1ff947{HTTP/1.1}{0.0.0.0:8346}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@24569dba{/mcf-api-service,file:/C:/Use
>> rs/kawright/AppData/Local/Temp
>> /jetty-0.0.0.0-8346-mcf-api-service.war-_mcf-api-service-any
>> -1263632524762735599.dir/webapp/,UNAVAILABLE}{C:\wip\mcf\
>> trunk\dist/web/war/mcf-api-service.war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@8462f31{/mcf-authority-service,file:/C
>> :/Users/kawright/AppData/Local
>> /Temp/jetty-0.0.0.0-8346-mcf-authority-service.war-_mcf-auth
>> ority-service-any-8765187688005999492.dir/webapp/,
>> UNAVAILABLE}{C:\wip\mcf\trunk\dist/web/war/mcf-authority-servi
>> ce.war}
>> [junit] [main] INFO org.eclipse.jetty.server.handler.ContextHandler
>> - Stopped o.e.j.w.WebAppContext@4d1c005e{/mcf-crawler-ui,file:/C:/User
>> s/kawright/AppData/Local/Temp/
>> jetty-0.0.0.0-8346-mcf-crawler-ui.war-_mcf-crawler-ui-any-
>> 4871569714684839734.dir/webapp/,UNAVAILABLE}{C:\wip\
>> mcf\trunk\dist/web/war/mcf-crawler-ui.war}
>> [junit] ------------- ---------------- ---------------
>>
>> BUILD SUCCESSFUL
>> Total time: 2 minutes 8 seconds
>> <<<<<<
>>
>> This is running against my installed laptop version of Postgresql on
>> Windows (version 9.3), with the shipping Postgresql JDBC driver 42.1.3.
>> The test is a simple crawl against a locally-written RSS service.
>>
>>
>> Karl
>>
>>
>> On Thu, Mar 8, 2018 at 9:54 AM, Karl Wright <[email protected]> wrote:
>>
>>> I've reviewed all changes to the RSS connector and to the framework over
>>> the last year, and none of them could reasonably have been expected to have
>>> any kind of effect like this. The only things changed were the redirect
>>> strategy and updating to the latest Postgresql JDBC driver.
>>>
>>> If the problem doesn't occur in the single-process example, the next
>>> question is: do you have a multiprocess setup? If so, try the multiprocess
>>> example and see if that succeeds. If it does, the problem is how we work
>>> with Postgresql.
>>>
>>> Karl
>>>
>>>
>>> On Thu, Mar 8, 2018 at 9:41 AM, Karl Wright <[email protected]> wrote:
>>>
>>>> Hi Mike,
>>>>
>>>> You are the third person this morning that has reported this in
>>>> conjunction with Postgresql. It is possible that some behavior we count on
>>>> broke in the latest postgresql release. Can you tell me what version you
>>>> are using? Do you see the same behavior when you run with the built-in
>>>> HSQLDB example?
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Thu, Mar 8, 2018 at 9:32 AM, Mike Hugo <[email protected]> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I set up a new manifold instance based on the simple example. I
>>>>> modified properties.xml to point to a postgresql database and then set it
>>>>> up to read an RSS feed. It uses a custom output connector to send the
>>>>> data
>>>>> to a custom API.
>>>>>
>>>>> I've noticed that it starts properly, but it only pulls in 3 or 4
>>>>> records before it "hangs" and doesn't pull in more docs after that. If I
>>>>> bounce the server then it will pull in 3 or 4 more docs, but then seems to
>>>>> hang again.
>>>>>
>>>>> I can add a new RSS feed and start it, but it won't pull in any
>>>>> documents until the server is bounced.
>>>>>
>>>>> I increased the value of org.apache.manifoldcf.crawler.threads and
>>>>> that seems to help, but it just delays the same behavior. For example, it
>>>>> might pull in 10 or 15 docs, but then stops pulling them in again. No
>>>>> messages in the logs.
>>>>>
>>>>> It does appear that it's spawning many many of these threads:
>>>>> ExecuteQueryThread
>>>>>
>>>>> Any ideas where to start looking or how to debug why it hangs after
>>>>> only a few documents?
>>>>>
>>>>> Thanks!!
>>>>>
>>>>> Mike
>>>>>
>>>>
>>>>
>>>
>>
>
