[jira] Created: (CONNECTORS-162) To support connectors that have their own class loaders, we need to provide a way of creating derivative resource loaders
To support connectors that have their own class loaders, we need to provide a way of creating derivative resource loaders - Key: CONNECTORS-162 URL: https://issues.apache.org/jira/browse/CONNECTORS-162 Project: ManifoldCF Issue Type: Improvement Components: Framework core Affects Versions: ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright ManifoldCF currently creates its own resource loader, which adds jars to the class path based on properties.xml parameters. Connectors may need to create derivative resource loaders, to isolate conflicting jars, and we should make this easy. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
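A minimal sketch of what a "derivative" resource loader could look like, assuming plain `java.net.URLClassLoader`. The class name `DerivedResourceLoader` is hypothetical and is not the ManifoldCF API. It layers connector-specific jars on top of a parent loader and consults them first, so that a conflicting jar visible to the parent does not win:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration only; not the ManifoldCF resource loader API.
final class DerivedResourceLoader extends URLClassLoader {
  DerivedResourceLoader(URL[] connectorJars, ClassLoader parent) {
    super(connectorJars, parent);
  }

  // Child-first lookup: try this loader's own jars before delegating to the parent.
  @Override
  protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
      Class<?> c = findLoadedClass(name);
      if (c == null) {
        if (name.startsWith("java.")) {
          // Never try to shadow core platform classes.
          c = super.loadClass(name, false);
        } else {
          try {
            c = findClass(name);              // connector-private jars first
          } catch (ClassNotFoundException e) {
            c = super.loadClass(name, false); // fall back to the parent chain
          }
        }
      }
      if (resolve) {
        resolveClass(c);
      }
      return c;
    }
  }
}
```

Child-first delegation is one way to isolate conflicting jars; whether the framework should use it, or ordinary parent-first delegation with a separate jar list, is a design choice the issue leaves open.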
[jira] Resolved: (CONNECTORS-162) To support connectors that have their own class loaders, we need to provide a way of creating derivative resource loaders
[ https://issues.apache.org/jira/browse/CONNECTORS-162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-162. Resolution: Fixed Fix Version/s: ManifoldCF next r1072823. To support connectors that have their own class loaders, we need to provide a way of creating derivative resource loaders - Key: CONNECTORS-162 URL: https://issues.apache.org/jira/browse/CONNECTORS-162 Project: ManifoldCF Issue Type: Improvement Components: Framework core Affects Versions: ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next ManifoldCF currently creates its own resource loader, which adds jars to the class path based on properties.xml parameters. Connectors may need to create derivative resource loaders, to isolate conflicting jars, and we should make this easy. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (CONNECTORS-163) Go to current version of Derby, to try and avoid internal deadlocks
Go to current version of Derby, to try and avoid internal deadlocks --- Key: CONNECTORS-163 URL: https://issues.apache.org/jira/browse/CONNECTORS-163 Project: ManifoldCF Issue Type: Improvement Components: Framework core Affects Versions: ManifoldCF next Reporter: Karl Wright Derby 10.5.3.0 internally deadlocks on the straightforward correlated subqueries involving the carrydown table. The source of the problem is not clear. However, there's a newer version of Derby available. If it passes the tests, I recommend trying that to see if the problem is fixed. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (CONNECTORS-165) Upgrade to jetty 6.1.26, with patches, for latest unicode support etc.
Upgrade to jetty 6.1.26, with patches, for latest unicode support etc. -- Key: CONNECTORS-165 URL: https://issues.apache.org/jira/browse/CONNECTORS-165 Project: ManifoldCF Issue Type: Improvement Components: Framework core Affects Versions: ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Priority: Minor The version of jetty we're using is pretty old. Solr is upgrading to 6.1.26, plus some patches. We should probably do the same. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Resolved: (CONNECTORS-165) Upgrade to jetty 6.1.26, with patches, for latest unicode support etc.
[ https://issues.apache.org/jira/browse/CONNECTORS-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-165. Resolution: Fixed Fix Version/s: ManifoldCF next r1074713. Upgrade to jetty 6.1.26, with patches, for latest unicode support etc. -- Key: CONNECTORS-165 URL: https://issues.apache.org/jira/browse/CONNECTORS-165 Project: ManifoldCF Issue Type: Improvement Components: Framework core Affects Versions: ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next The version of jetty we're using is pretty old. Solr is upgrading to 6.1.26, plus some patches. We should probably do the same. -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Created: (CONNECTORS-166) Crawl seizes up when running Derby
Crawl seizes up when running Derby -- Key: CONNECTORS-166 URL: https://issues.apache.org/jira/browse/CONNECTORS-166 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Affects Versions: ManifoldCF 0.1, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright
A crawl using multiple worker threads with Derby eventually hangs, because threads get deadlocked dealing with carrydown information. At the time of hang, a thread dump yields:
Worker thread '5' daemon prio=6 tid=0x02fc7800 nid=0xd78 in Object.wait() [0x0465f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x2858b720 (a org.apache.manifoldcf.core.database.Database$ExecuteQueryThread)
at java.lang.Thread.join(Unknown Source)
- locked 0x2858b720 (a org.apache.manifoldcf.core.database.Database$ExecuteQueryThread)
at java.lang.Thread.join(Unknown Source)
at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:453)
at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:489)
at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1131)
at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
at org.apache.manifoldcf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:785)
at org.apache.manifoldcf.crawler.jobs.JobManager.processDeleteHashSet(JobManager.java:2592)
at org.apache.manifoldcf.crawler.jobs.JobManager.calculateAffectedDeleteCarrydownChildren(JobManager.java:2565)
at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentDeletedMultiple(JobManager.java:2494)
at org.apache.manifoldcf.crawler.system.WorkerThread.processDeleteLists(WorkerThread.java:1077)
at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:544)
... for at least two threads.
[jira] Assigned: (CONNECTORS-159) Support for external PostgreSQL server
[ https://issues.apache.org/jira/browse/CONNECTORS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-159: -- Assignee: Karl Wright Support for external PostgreSQL server -- Key: CONNECTORS-159 URL: https://issues.apache.org/jira/browse/CONNECTORS-159 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 0.1 Reporter: Erlend Garåsen Assignee: Karl Wright Fix For: ManifoldCF 0.2 Attachments: CONNECTORS-159.patch It should be possible to configure an external PostgreSQL server and optionally configure ManifoldCF to communicate with PostgreSQL by using SSL. I suggest that the two following properties are added: org.apache.manifoldcf.postgresql.hostname (not required, defaults to localhost) org.apache.manifoldcf.postgresql.ssl (not required, defaults to false) -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CONNECTORS-159) Support for external PostgreSQL server
[ https://issues.apache.org/jira/browse/CONNECTORS-159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-159: --- Resolution: Fixed Status: Resolved (was: Patch Available) r1078003, slightly modified as stated. Support for external PostgreSQL server -- Key: CONNECTORS-159 URL: https://issues.apache.org/jira/browse/CONNECTORS-159 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 0.1 Reporter: Erlend Garåsen Assignee: Karl Wright Fix For: ManifoldCF 0.2 Attachments: CONNECTORS-159.patch It should be possible to configure an external PostgreSQL server and optionally configure ManifoldCF to communicate with PostgreSQL by using SSL. I suggest that the two following properties are added: org.apache.manifoldcf.postgresql.hostname (not required, defaults to localhost) org.apache.manifoldcf.postgresql.ssl (not required, defaults to false) -- This message is automatically generated by JIRA. - For more information on JIRA, see: http://www.atlassian.com/software/jira
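A hedged sketch of how the two proposed properties might be turned into a JDBC URL. Only the two property names and their defaults come from the issue; the helper class `PostgresUrlBuilder`, the `jdbcUrl` method, and the database name argument are hypothetical, and the committed change ("slightly modified") may differ. The `ssl=true` URL parameter is the PostgreSQL JDBC driver's connection option:

```java
import java.util.Properties;

// Hypothetical helper; only the two property names and defaults come from the issue text.
final class PostgresUrlBuilder {
  static final String HOSTNAME_PROPERTY = "org.apache.manifoldcf.postgresql.hostname";
  static final String SSL_PROPERTY = "org.apache.manifoldcf.postgresql.ssl";

  // Builds a PostgreSQL JDBC URL, defaulting to localhost without SSL.
  static String jdbcUrl(Properties props, String databaseName) {
    String host = props.getProperty(HOSTNAME_PROPERTY, "localhost");
    boolean ssl = Boolean.parseBoolean(props.getProperty(SSL_PROPERTY, "false"));
    String url = "jdbc:postgresql://" + host + "/" + databaseName;
    return ssl ? url + "?ssl=true" : url;
  }
}
```

With neither property set, the URL points at localhost without SSL, matching the "not required" defaults proposed above.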
[jira] Commented: (CONNECTORS-166) Crawl seizes up when running Derby
[ https://issues.apache.org/jira/browse/CONNECTORS-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006583#comment-13006583 ] Karl Wright commented on CONNECTORS-166: According to the Derby team, Derby trunk fixes this problem. I've therefore built trunk and checked it in. r1081520.
Crawl seizes up when running Derby -- Key: CONNECTORS-166 URL: https://issues.apache.org/jira/browse/CONNECTORS-166 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright
A crawl using multiple worker threads with Derby eventually hangs, because threads get deadlocked dealing with carrydown information. At the time of hang, a thread dump yields:
Worker thread '5' daemon prio=6 tid=0x02fc7800 nid=0xd78 in Object.wait() [0x0465f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0x2858b720 (a org.apache.manifoldcf.core.database.Database$ExecuteQueryThread)
at java.lang.Thread.join(Unknown Source)
- locked 0x2858b720 (a org.apache.manifoldcf.core.database.Database$ExecuteQueryThread)
at java.lang.Thread.join(Unknown Source)
at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:453)
at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:489)
at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1131)
at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:168)
at org.apache.manifoldcf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:785)
at org.apache.manifoldcf.crawler.jobs.JobManager.processDeleteHashSet(JobManager.java:2592)
at org.apache.manifoldcf.crawler.jobs.JobManager.calculateAffectedDeleteCarrydownChildren(JobManager.java:2565)
at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentDeletedMultiple(JobManager.java:2494)
at org.apache.manifoldcf.crawler.system.WorkerThread.processDeleteLists(WorkerThread.java:1077)
at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:544)
... for at least two threads.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CONNECTORS-168) Solr connector does not handle 500 errors well; it assumes they have an XML response
[ https://issues.apache.org/jira/browse/CONNECTORS-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-168: --- Fix Version/s: ManifoldCF 0.2 Status: Patch Available (was: Open) This patch should allow the raw stack trace and error to be returned if the response is not XML. Solr connector does not handle 500 errors well; it assumes they have an XML response Key: CONNECTORS-168 URL: https://issues.apache.org/jira/browse/CONNECTORS-168 Project: ManifoldCF Issue Type: Bug Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 The Solr Connector presumes that the response from Solr is XML. Unfortunately, in some cases it isn't, such as when there's a 500 error and a stack trace. We should make this work too, without dumping a trace to standard out. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CONNECTORS-168) Solr connector does not handle 500 errors well; it assumes they have an XML response
[ https://issues.apache.org/jira/browse/CONNECTORS-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-168: --- Comment: was deleted (was: This patch should allow the raw stack trace and error to be returned if the response is not XML. ) Solr connector does not handle 500 errors well; it assumes they have an XML response Key: CONNECTORS-168 URL: https://issues.apache.org/jira/browse/CONNECTORS-168 Project: ManifoldCF Issue Type: Bug Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 Attachments: CONNECTORS-168.patch The Solr Connector presumes that the response from Solr is XML. Unfortunately, in some cases it isn't, such as when there's a 500 error and a stack trace. We should make this work too, without dumping a trace to standard out. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (CONNECTORS-168) Solr connector does not handle 500 errors well; it assumes they have an XML response
[ https://issues.apache.org/jira/browse/CONNECTORS-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-168: --- Status: Open (was: Patch Available) Solr connector does not handle 500 errors well; it assumes they have an XML response Key: CONNECTORS-168 URL: https://issues.apache.org/jira/browse/CONNECTORS-168 Project: ManifoldCF Issue Type: Bug Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 Attachments: CONNECTORS-168.patch The Solr Connector presumes that the response from Solr is XML. Unfortunately, in some cases it isn't, such as when there's a 500 error and a stack trace. We should make this work too, without dumping a trace to standard out. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Resolved: (CONNECTORS-168) Solr connector does not handle 500 errors well; it assumes they have an XML response
[ https://issues.apache.org/jira/browse/CONNECTORS-168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-168. Resolution: Fixed r1081809. Solr connector does not handle 500 errors well; it assumes they have an XML response Key: CONNECTORS-168 URL: https://issues.apache.org/jira/browse/CONNECTORS-168 Project: ManifoldCF Issue Type: Bug Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 Attachments: CONNECTORS-168.patch The Solr Connector presumes that the response from Solr is XML. Unfortunately, in some cases it isn't, such as when there's a 500 error and a stack trace. We should make this work too, without dumping a trace to standard out. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
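The behavior asked for here can be sketched as follows: try to parse the response body as XML, and if that fails, report the raw body instead of assuming XML. This is an illustration using the JDK's built-in parser, not the connector's actual code; the class `SolrResponseSketch` and its methods are hypothetical. A custom error handler is installed so that parse failures are not printed to the console by the parser's default handler:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical illustration; not the Solr connector's real response handling.
final class SolrResponseSketch {
  // Returns the parsed document, or null if the body is not well-formed XML.
  static Document parseOrNull(String body) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // Rethrow parse problems quietly instead of letting the default handler print them.
      builder.setErrorHandler(new ErrorHandler() {
        public void warning(SAXParseException e) { }
        public void error(SAXParseException e) throws SAXParseException { throw e; }
        public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
      });
      return builder.parse(new InputSource(new StringReader(body)));
    } catch (Exception e) {
      return null;
    }
  }

  // Describes a response without assuming that it is XML.
  static String describe(int status, String body) {
    Document doc = parseOrNull(body);
    if (doc != null) {
      return "HTTP " + status + " (XML response, root element '"
          + doc.getDocumentElement().getNodeName() + "')";
    }
    return "HTTP " + status + " (non-XML response): " + body.trim();
  }
}
```

For a 500 response carrying a plain stack trace, `describe` returns the raw text so that it can be logged or surfaced to the user rather than lost in a parse failure.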
[jira] Commented: (CONNECTORS-100) DB lock timeout, and/or indefinite or excessive database activity
[ https://issues.apache.org/jira/browse/CONNECTORS-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007773#comment-13007773 ] Karl Wright commented on CONNECTORS-100: Confirmed. It's not an actual deadlock; it recovers after an extended period of time. It's not clear which query is slow, although this particular one seems to be a possibility:
UPDATE hopcount SET deathmark='D',distance=-1 WHERE id IN(SELECT ownerid FROM hopdeletedeps t0 WHERE ((t0.jobid=1300315252437 AND t0.childidhash='0867FAD4FB2B46E04F2AFA9A1200D63266D48089')) AND EXISTS(SELECT 'x' FROM intrinsiclink t1 WHERE t1.linktype=t0.linktype AND t1.jobid=t0.jobid AND t1.parentidhash=t0.parentidhash AND t1.childidhash=t0.childidhash))
Derby's diagnostics for query plan output seem inadequate to assess this in real time, unfortunately.
DB lock timeout, and/or indefinite or excessive database activity - Key: CONNECTORS-100 URL: https://issues.apache.org/jira/browse/CONNECTORS-100 Project: ManifoldCF Issue Type: Bug Components: Framework core Environment: Running unmodified dist/example from trunk/ using the default configuration. Reporter: Andrzej Bialecki Assignee: Karl Wright
When a job is started and running (via crawler-ui), occasionally it's not possible to display a list of running jobs. The problem persists even after restarting ACF.
The following exception is thrown in the console: {code}
org.apache.acf.core.interfaces.ACFException: Database exception: Exception doing query: A lock could not be obtained within the time requested
at org.apache.acf.core.database.Database.executeViaThread(Database.java:421)
at org.apache.acf.core.database.Database.executeUncachedQuery(Database.java:465)
at org.apache.acf.core.database.Database$QueryCacheExecutor.create(Database.java:1072)
at org.apache.acf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
at org.apache.acf.core.database.Database.executeQuery(Database.java:167)
at org.apache.acf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:727)
at org.apache.acf.crawler.jobs.JobManager.makeJobStatus(JobManager.java:5611)
at org.apache.acf.crawler.jobs.JobManager.getAllStatus(JobManager.java:5549)
at org.apache.jsp.showjobstatus_jsp._jspService(showjobstatus_jsp.java:316)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:377)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.sql.SQLTransactionRollbackException: A lock could not be obtained within the time requested
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source
[jira] Created: (CONNECTORS-169) Maximum OR clause constant not present in database abstraction
Maximum OR clause constant not present in database abstraction -- Key: CONNECTORS-169 URL: https://issues.apache.org/jira/browse/CONNECTORS-169 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright There's a maximum IN clause constant in IDBInterface, but the maximum OR clause number is hardwired, and is strewn throughout the code. That makes it impossible to address (for example) the recently detected Derby bug underlying CONNECTORS-100. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
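To illustrate why a configurable maximum matters, here is a minimal sketch of splitting a key list into batches that respect a per-database OR clause limit. The class `OrClauseChunker` and its methods are hypothetical; in the framework the limit would presumably come from the database abstraction, alongside the existing maximum IN clause constant:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of batching by a database-specific OR clause limit.
final class OrClauseChunker {
  // Splits keys into consecutive batches of at most maxOrClause entries each.
  static <T> List<List<T>> chunk(List<T> keys, int maxOrClause) {
    if (maxOrClause < 1) {
      throw new IllegalArgumentException("maxOrClause must be >= 1");
    }
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < keys.size(); i += maxOrClause) {
      int end = Math.min(i + maxOrClause, keys.size());
      batches.add(new ArrayList<>(keys.subList(i, end)));
    }
    return batches;
  }

  // Builds a parameterized "(col=? OR col=? ...)" fragment for one non-empty batch.
  static String orClause(String column, int count) {
    StringBuilder sb = new StringBuilder("(");
    for (int i = 0; i < count; i++) {
      if (i > 0) {
        sb.append(" OR ");
      }
      sb.append(column).append("=?");
    }
    return sb.append(")").toString();
  }
}
```

With the limit supplied by each database implementation, a Derby-specific workaround becomes a matter of returning a smaller number rather than editing every call site.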
[jira] Created: (CONNECTORS-170) Derby database driver needs to periodically update statistics
Derby database driver needs to periodically update statistics - Key: CONNECTORS-170 URL: https://issues.apache.org/jira/browse/CONNECTORS-170 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright The Derby database driver needs to update statistics periodically, using logic similar to that developed for PostgreSQL. The way that's done is through calling SYSCS_UTIL.SYSCS_UPDATE_STATISTICS on the table in question. http://db.apache.org/derby/docs/10.7/ref/rrefupdatestatsproc.html. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
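A rough sketch of the "periodic" part, assuming a simple modification counter per table. The class `DerbyStatisticsSketch`, its threshold, and its method names are hypothetical and are not the logic used for PostgreSQL in the framework. The procedure name and its three arguments (schema, table, index name, where a null index name covers all of the table's indexes) follow the Derby reference page linked above:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical illustration; the threshold-based trigger is an assumption.
final class DerbyStatisticsSketch {
  // A NULL index name asks Derby to update statistics for all indexes on the table.
  static final String UPDATE_STATISTICS_CALL =
      "CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS(?, ?, NULL)";

  private final int threshold;
  private int modificationsSinceUpdate = 0;

  DerbyStatisticsSketch(int threshold) {
    this.threshold = threshold;
  }

  // Records modifications; returns true (and resets the count) once the threshold is reached.
  synchronized boolean noteModifications(int count) {
    modificationsSinceUpdate += count;
    if (modificationsSinceUpdate >= threshold) {
      modificationsSinceUpdate = 0;
      return true;
    }
    return false;
  }

  // Runs the Derby system procedure for one table. Derby stores unquoted
  // identifiers in upper case, so schema and table are normally upper case.
  static void updateStatistics(Connection connection, String schema, String table)
      throws SQLException {
    try (CallableStatement call = connection.prepareCall(UPDATE_STATISTICS_CALL)) {
      call.setString(1, schema);
      call.setString(2, table);
      call.execute();
    }
  }
}
```

A caller would invoke `noteModifications` after each insert, update, or delete batch and call `updateStatistics` only when it returns true, keeping the cost of the procedure off the common path.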
[jira] Assigned: (CONNECTORS-170) Derby database driver needs to periodically update statistics
[ https://issues.apache.org/jira/browse/CONNECTORS-170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-170: -- Assignee: Karl Wright Derby database driver needs to periodically update statistics - Key: CONNECTORS-170 URL: https://issues.apache.org/jira/browse/CONNECTORS-170 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright The Derby database driver needs to update statistics periodically, using logic similar to that developed for PostgreSQL. The way that's done is through calling SYSCS_UTIL.SYSCS_UPDATE_STATISTICS on the table in question. http://db.apache.org/derby/docs/10.7/ref/rrefupdatestatsproc.html. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Resolved: (CONNECTORS-169) Maximum OR clause constant not present in database abstraction
[ https://issues.apache.org/jira/browse/CONNECTORS-169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-169. Resolution: Fixed Fix Version/s: ManifoldCF 0.2 r1082484. Maximum OR clause constant not present in database abstraction -- Key: CONNECTORS-169 URL: https://issues.apache.org/jira/browse/CONNECTORS-169 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 There's a maximum IN clause constant in IDBInterface, but the maximum OR clause number is hardwired, and is strewn throughout the code. That makes it impossible to address (for example) the recently detected Derby bug underlying CONNECTORS-100. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Resolved: (CONNECTORS-170) Derby database driver needs to periodically update statistics
[ https://issues.apache.org/jira/browse/CONNECTORS-170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-170. Resolution: Fixed Fix Version/s: ManifoldCF 0.2 r1082598. Derby database driver needs to periodically update statistics - Key: CONNECTORS-170 URL: https://issues.apache.org/jira/browse/CONNECTORS-170 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 The Derby database driver needs to update statistics periodically, using logic similar to that developed for PostgreSQL. The way that's done is through calling SYSCS_UTIL.SYSCS_UPDATE_STATISTICS on the table in question. http://db.apache.org/derby/docs/10.7/ref/rrefupdatestatsproc.html. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-172) Intermittent test failures
Intermittent test failures -- Key: CONNECTORS-172 URL: https://issues.apache.org/jira/browse/CONNECTORS-172 Project: ManifoldCF Issue Type: Bug Components: Tests Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor The Derby filesystem end-to-end tests sometimes randomly fail with a "Database error: No existing connection" error during a job status wait. Not sure what's happening here, but they succeed much of the time. There's not much of a hint beyond the stack trace. The message seems to be coming from Derby, and may be the result of a too-short wait time limit, a race condition in the test itself, or something else entirely. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-171) RSS Connector dechromed mode can corrupt data prior to indexing
[ https://issues.apache.org/jira/browse/CONNECTORS-171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-171: --- Resolution: Fixed Fix Version/s: ManifoldCF next Assignee: Karl Wright Status: Resolved (was: Patch Available) r1085506. RSS Connector dechromed mode can corrupt data prior to indexing --- Key: CONNECTORS-171 URL: https://issues.apache.org/jira/browse/CONNECTORS-171 Project: ManifoldCF Issue Type: Bug Components: Framework core, RSS connector Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next Attachments: XML.java, patch.txt When the contents of the description or content fields of the feed contains entity references, the content may be turned into invalid XML before being handed to the index. The cause of this is decoding of the entity references by the XML parser that parses the feed, while the output of text in-between tags is not re-encoded. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
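The missing step described above is re-encoding text after the parser has decoded its entity references. A minimal sketch of such a re-encoding function follows; the class `XmlEscapeSketch` is hypothetical and is not the attached XML.java:

```java
// Hypothetical illustration of re-encoding decoded text before it is written back out as XML.
final class XmlEscapeSketch {
  // Replaces the five XML special characters with their predefined entity references.
  static String escape(String text) {
    StringBuilder sb = new StringBuilder(text.length());
    for (int i = 0; i < text.length(); i++) {
      char ch = text.charAt(i);
      switch (ch) {
        case '&': sb.append("&amp;"); break;
        case '<': sb.append("&lt;"); break;
        case '>': sb.append("&gt;"); break;
        case '"': sb.append("&quot;"); break;
        case '\'': sb.append("&apos;"); break;
        default: sb.append(ch);
      }
    }
    return sb.toString();
  }
}
```

For example, a feed description containing `AT&amp;T` is decoded by the parser to `AT&T`; writing that text out unescaped yields invalid XML, whereas passing it through `escape` restores `AT&amp;T`.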
[jira] [Commented] (CONNECTORS-173) Table Entity on javadoc
[ https://issues.apache.org/jira/browse/CONNECTORS-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013910#comment-13013910 ] Karl Wright commented on CONNECTORS-173: Offhand, these look good. I'll have to see what the generated javadoc HTML looks like though. Table Entity on javadoc --- Key: CONNECTORS-173 URL: https://issues.apache.org/jira/browse/CONNECTORS-173 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF next Reporter: Shinichiro Abe Priority: Minor Labels: documentation Fix For: ManifoldCF next Attachments: CONNECTORS-173-agents.patch, CONNECTORS-173-authorities.patch, CONNECTORS-173-crawler.patch, CONNECTORS-173-webcrawler.patch, MCF_Tables.xls, example1.png, example2.png Original Estimate: 24h Remaining Estimate: 24h Proposal: MCF manages about 20 tables. I would like to understand how the database is managed by looking at the tables, but the MCF documentation currently has almost no explanation of them. I think the javadoc could explain this, as in the example description below. That would help users understand the relationship between each manager class and its table, and the relationships between tables. May I add this javadoc to each manager class? The related tables to be covered are listed in the attachment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-173) Table Entity on javadoc
[ https://issues.apache.org/jira/browse/CONNECTORS-173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013964#comment-13013964 ] Karl Wright commented on CONNECTORS-173: I'll be pushing the newer javadocs out to the web site this evening. Table Entity on javadoc --- Key: CONNECTORS-173 URL: https://issues.apache.org/jira/browse/CONNECTORS-173 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF next Reporter: Shinichiro Abe Assignee: Karl Wright Priority: Minor Labels: documentation Fix For: ManifoldCF next Attachments: CONNECTORS-173-agents.patch, CONNECTORS-173-authorities.patch, CONNECTORS-173-crawler.patch, CONNECTORS-173-webcrawler.patch, MCF_Tables.xls, example1.png, example2.png Original Estimate: 24h Remaining Estimate: 24h Proposal: MCF manages about 20 tables. I would like to understand how the database is managed by looking at the tables, but the MCF documentation currently has almost no explanation of them. I think the javadoc could explain this, as in the example description below. That would help users understand the relationship between each manager class and its table, and the relationships between tables. May I add this javadoc to each manager class? The related tables to be covered are listed in the attachment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-172) Intermittent test failures
[ https://issues.apache.org/jira/browse/CONNECTORS-172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014098#comment-13014098 ] Karl Wright commented on CONNECTORS-172: Seems to be a Derby problem. DERBY-5169 created. Intermittent test failures -- Key: CONNECTORS-172 URL: https://issues.apache.org/jira/browse/CONNECTORS-172 Project: ManifoldCF Issue Type: Bug Components: Tests Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor The Derby filesystem end-to-end tests sometimes randomly fail with a "Database error: No existing connection" error during a job status wait. Not sure what's happening here, but they succeed much of the time. There's not much of a hint beyond the stack trace. The message seems to be coming from Derby, and may be the result of a too-short wait time limit, a race condition in the test itself, or something else entirely. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-172) Intermittent test failures
[ https://issues.apache.org/jira/browse/CONNECTORS-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-172: --- Priority: Blocker (was: Minor) Intermittent test failures -- Key: CONNECTORS-172 URL: https://issues.apache.org/jira/browse/CONNECTORS-172 Project: ManifoldCF Issue Type: Bug Components: Tests Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Priority: Blocker The Derby filesystem end-to-end tests sometimes randomly fail with a "Database error: No existing connection" error during a job status wait. Not sure what's happening here, but they succeed much of the time. There's not much of a hint beyond the stack trace. The message seems to be coming from Derby, and may be the result of a too-short wait time limit, a race condition in the test itself, or something else entirely. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-172) Intermittent test failures
[ https://issues.apache.org/jira/browse/CONNECTORS-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-172. Resolution: Fixed Fix Version/s: ManifoldCF next ManifoldCF 0.2 Assignee: Karl Wright r1087607 r1087610 Intermittent test failures -- Key: CONNECTORS-172 URL: https://issues.apache.org/jira/browse/CONNECTORS-172 Project: ManifoldCF Issue Type: Bug Components: Tests Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Priority: Blocker Fix For: ManifoldCF 0.2, ManifoldCF next The Derby filesystem end-to-end tests sometimes randomly fail with a "Database error: No existing connection" error during a job status wait. Not sure what's happening here, but they succeed much of the time. There's not much of a hint beyond the stack trace. The message seems to be coming from Derby, and may be the result of a too-short wait time limit, a race condition in the test itself, or something else entirely. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-175) The site documentation property list does not include the PostgreSQL-specific parameters, and may be missing some of the Derby ones too
The site documentation property list does not include the PostgreSQL-specific parameters, and may be missing some of the Derby ones too --- Key: CONNECTORS-175 URL: https://issues.apache.org/jira/browse/CONNECTORS-175 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF next Reporter: Karl Wright Priority: Minor The table that documents all the properties in properties.xml seems to be missing the PostgreSQL-specific ones. This is the how-to-build-and-deploy.html page. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-176) It might be nice to have a direct link from the site menu bar to the performance tuning page
It might be nice to have a direct link from the site menu bar to the performance tuning page Key: CONNECTORS-176 URL: https://issues.apache.org/jira/browse/CONNECTORS-176 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF next Reporter: Karl Wright Priority: Minor The only way to get to the performance tuning page now is through the developer support page. A direct link might make it easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-176) It might be nice to have a direct link from the site navigation area to the performance tuning page
[ https://issues.apache.org/jira/browse/CONNECTORS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-176: --- Summary: It might be nice to have a direct link from the site navigation area to the performance tuning page (was: It might be nice to have a direct link from the site menu bar to the performance tuning page) It might be nice to have a direct link from the site navigation area to the performance tuning page --- Key: CONNECTORS-176 URL: https://issues.apache.org/jira/browse/CONNECTORS-176 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF next Reporter: Karl Wright Priority: Minor The only way to get to the performance tuning page now is through the developer support page. A direct link might make it easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-92) Move from ant to maven or other build system with decent library management
[ https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015008#comment-13015008 ] Karl Wright commented on CONNECTORS-92: --- This issue has stalled. The work done so far has created a reasonable directory structure with a pom.xml everywhere it needs to be. However, beyond that, nothing more has happened. It might be worth researching ant with ivy at this point, since that would be a natural extension of the current build system, rather than trying to go the maven route entirely. Move from ant to maven or other build system with decent library management --- Key: CONNECTORS-92 URL: https://issues.apache.org/jira/browse/CONNECTORS-92 Project: ManifoldCF Issue Type: Wish Components: Build Reporter: Jettro Coenradie Assignee: Karl Wright Attachments: Screen shot 2010-08-23 at 16.31.07.png, maven-poms-including-start-jar.patch, maven-poms-problem-starting-jetty-and-derby.patch, maven-start-jar.patch, move-to-maven-acf-framework.patch, patch-connectors.zip I am looking at the current project structure. If we want to make another build tool available I think we need to change the directory structure. I tried to place a suggestion in an image. Can you please have a look at it? If we agree that this is a good way to go, then I will continue to work on a patch. That might be a bit hard with all these changing directories, but I'll do my best to at least get an idea of whether it would work. So I have three questions: - Do you want to move to maven or put maven next to ant? - Do you prefer another build mechanism [ant with ivy, gradle, maven3]? - Do you have an idea about the number of scripts that need to be changed if we change the project structure? The image of a possible project layout (that is based on the maven standards) is attached to the issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-58) ManifoldCF scripting language and some example scripts, executed via the API, plus example jobs for file system and web crawl
[ https://issues.apache.org/jira/browse/CONNECTORS-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-58: -- Summary: ManifoldCF scripting language and some example scripts, executed via the API, plus example jobs for file system and web crawl (was: Mini-API to initially configure default connections and example jobs for file system and web crawl ) ManifoldCF scripting language and some example scripts, executed via the API, plus example jobs for file system and web crawl Key: CONNECTORS-58 URL: https://issues.apache.org/jira/browse/CONNECTORS-58 Project: ManifoldCF Issue Type: Sub-task Components: Examples Reporter: Jack Krupansky Priority: Minor Creating a basic connection setup to do a relatively simple crawl for a file system or web can be a daunting task for someone new to LCF. So, it would be nice to have a scripting file that supports an abbreviated API (subset of the full API discussed in CONNECTORS-56) sufficient to create a default set of connections and example jobs that the new user can choose from. Beyond this initial need, this script format might be a useful form to dump all of the connections and jobs in the LCF database in a form that can be used to recreate an LCF configuration. Kind of a dump and reload capability. That in fact might be how the initial example script gets created. Those are two distinct use cases, but could utilize the same feature. The example script could have example jobs to crawl a subdirectory of LCF, crawl the LCF wiki, etc. There could be more than one script. There might be example scripts for each form of connector. This capability should be available for both QuickStart and the general release of LCF. As just one possibility, the script format might be a sequence of JSON expressions, each with an initial string analogous to a servlet path to specify the operation to be performed, followed by the JSON form of the connection or job or other LCF object. 
Or, some other format might be more suitable. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-110) Max activity and Max bandwidth reports don't work properly under Derby
[ https://issues.apache.org/jira/browse/CONNECTORS-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-110: --- Summary: Max activity and Max bandwidth reports don't work properly under Derby (was: Max activity and Max bandwidth reports fail under Derby with a stack trace) Max activity and Max bandwidth reports don't work properly under Derby -- Key: CONNECTORS-110 URL: https://issues.apache.org/jira/browse/CONNECTORS-110 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Reporter: Karl Wright The failure occurs because the queries use the PostgreSQL DISTINCT ON (xxx) syntax, which Derby does not support. Unfortunately, there does not seem to be a way in Derby at present to do anything similar to DISTINCT ON (xxx), and the queries really can't be done without that. One option is to introduce a getCapabilities() method into the database implementation, which would allow ACF to query the database capabilities before even presenting the report in the navigation menu in the UI. Another alternative is to do a sizable chunk of resultset processing within ACF, which would require not only the DISTINCT ON() implementation, but also the enclosing sort and limit handling. It's the latter that would be most challenging, because of the difficulties with i18n etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
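The second alternative mentioned in CONNECTORS-110 (doing the DISTINCT ON work in the framework instead of in the database) might look roughly like the sketch below. It is only an illustration under stated assumptions: the Row type and the distinctOn helper are hypothetical names and not existing ManifoldCF code, and locale-aware collation (the i18n difficulty noted in the ticket) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class DistinctOnSketch {
  /** Hypothetical row type for illustration: one report record. */
  static final class Row {
    final String bin;
    final long count;
    Row(String bin, long count) { this.bin = bin; this.count = count; }
  }

  /**
   * Sorts a copy of the rows with the given order, then keeps only the
   * first row seen for each key. This mirrors what
   * SELECT DISTINCT ON (key) ... ORDER BY ... returns in PostgreSQL.
   */
  static <R, K> List<R> distinctOn(List<R> rows, Function<R, K> key, Comparator<R> order) {
    List<R> sorted = new ArrayList<>(rows);
    sorted.sort(order);
    // LinkedHashMap preserves the order in which keys were first seen.
    Map<K, R> firstPerKey = new LinkedHashMap<>();
    for (R row : sorted) {
      firstPerKey.putIfAbsent(key.apply(row), row);
    }
    return new ArrayList<>(firstPerKey.values());
  }
}
```

A "max activity" style report could then call distinctOn with the bin as key and a descending comparator on the count. The cost is that the whole result set has to be held in memory, which is part of why the ticket calls this approach sizable.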
[jira] [Updated] (CONNECTORS-55) Bundle database server with ManifoldCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-55: -- Summary: Bundle database server with ManifoldCF packaged product (was: Bundle database server with LCF packaged product) Bundle database server with ManifoldCF packaged product --- Key: CONNECTORS-55 URL: https://issues.apache.org/jira/browse/CONNECTORS-55 Project: ManifoldCF Issue Type: Sub-task Components: Installers Reporter: Jack Krupansky The current requirement that the user install and deploy a PostgreSQL server complicates the installation and deployment of LCF for the user. Installation and deployment of LCF should be as simple as Solr itself. QuickStart is great for the low-end and basic evaluation, but a comparable level of simplified installation and deployment is still needed for full-blown, high-end environments that need the full performance of a PostgreSQL-class database server. So, PostgreSQL should be bundled with the packaged release of LCF so that installation and deployment of LCF will automatically install and deploy a subset of the full PostgreSQL distribution that is sufficient for the needs of LCF. Starting LCF, with or without the LCF UI, should automatically start the database server. Shutting down LCF should also shut down the database server process. A typical use case would be for a non-developer who is comfortable with Solr and simply wants to crawl documents from, for example, a SharePoint repository and feed them into Solr. QuickStart should work well for the low end or in the early stages of evaluation, but the user would prefer to evaluate the real thing with something resembling a production crawl of thousands of documents. Such a user might not be a hard-core developer or be comfortable fiddling with a lot of software components simply to do one conceptually simple operation. 
It should still be possible for the user to supply database server settings to override the defaults, but the LCF package should have all of the best-practice settings deemed appropriate for use with LCF. One downside is that installation and deployment will be platform-specific since there are multiple processes and PostgreSQL itself requires a platform-specific installation. This proposal presumes that PostgreSQL is the best option for the foreseeable future, but nothing here is intended to preclude support for other database servers in future releases. This proposal should not have any impact on QuickStart packaging or deployment. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-120) Port all connectors to use httpclient 4.x, after we submit our remaining 3.x changes as commons-httpclient tickets
[ https://issues.apache.org/jira/browse/CONNECTORS-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-120: --- Summary: Port all connectors to use httpclient 4.x, after we submit our remaining 3.x changes as commons-httpclient tickets (was: Port all connectors to use httpclient 4.x, after it is released with full NTLM support) Port all connectors to use httpclient 4.x, after we submit our remaining 3.x changes as commons-httpclient tickets -- Key: CONNECTORS-120 URL: https://issues.apache.org/jira/browse/CONNECTORS-120 Project: ManifoldCF Issue Type: Task Components: LiveLink connector, Meridio connector, RSS connector, SharePoint connector, Web connector Reporter: Karl Wright Now that commons-httpclient has accepted our NTLM patch, we can upgrade our connectors to use their newest 4.x httpclient code. We still need to submit or apply patches for other features first, so this ticket depends on the resolution of that action, covered in CONNECTORS-119. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-177) File System Connector has some testing code in it
File System Connector has some testing code in it - Key: CONNECTORS-177 URL: https://issues.apache.org/jira/browse/CONNECTORS-177 Project: ManifoldCF Issue Type: Improvement Components: File system connector Affects Versions: ManifoldCF next Reporter: Karl Wright Priority: Minor The file system connector has testing code in it that should be removed. See getBinNames(). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CONNECTORS-175) The site documentation property list does not include the PostgreSQL-specific parameters, and may be missing some of the Derby ones too
[ https://issues.apache.org/jira/browse/CONNECTORS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-175: -- Assignee: Karl Wright The site documentation property list does not include the PostgreSQL-specific parameters, and may be missing some of the Derby ones too --- Key: CONNECTORS-175 URL: https://issues.apache.org/jira/browse/CONNECTORS-175 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Priority: Minor The table that documents all the properties in properties.xml seems to be missing the PostgreSQL-specific ones. This is the how-to-build-and-deploy.html page. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-175) The site documentation property list does not include the PostgreSQL-specific parameters, and may be missing some of the Derby ones too
[ https://issues.apache.org/jira/browse/CONNECTORS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-175. Resolution: Fixed Fix Version/s: ManifoldCF next r1089704. The site documentation property list does not include the PostgreSQL-specific parameters, and may be missing some of the Derby ones too --- Key: CONNECTORS-175 URL: https://issues.apache.org/jira/browse/CONNECTORS-175 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next The table that documents all the properties in properties.xml seems to be missing the PostgreSQL-specific ones. This is the how-to-build-and-deploy.html page. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-179) IConnector interface method setThreadContext should throw ManifoldCFException, but doesn't
[ https://issues.apache.org/jira/browse/CONNECTORS-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-179. Resolution: Fixed Fix Version/s: ManifoldCF next r1092566 IConnector interface method setThreadContext should throw ManifoldCFException, but doesn't -- Key: CONNECTORS-179 URL: https://issues.apache.org/jira/browse/CONNECTORS-179 Project: ManifoldCF Issue Type: Bug Components: Framework core Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next The IConnector method setThreadContext does not throw ManifoldCFException. This makes it very difficult to use the method for its intended purpose, which is to set up handles that are thread-context dependent. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-179) IConnector interface method setThreadContext should throw ManifoldCFException, but doesn't
[ https://issues.apache.org/jira/browse/CONNECTORS-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020163#comment-13020163 ] Karl Wright commented on CONNECTORS-179: Also r1092568. IConnector interface method setThreadContext should throw ManifoldCFException, but doesn't -- Key: CONNECTORS-179 URL: https://issues.apache.org/jira/browse/CONNECTORS-179 Project: ManifoldCF Issue Type: Bug Components: Framework core Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next The IConnector method setThreadContext does not throw ManifoldCFException. This makes it very difficult to use the method for its intended purpose, which is to set up handles that are thread-context dependent. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-180) Connector factories all have a Pool class that should be derived from a base Pool class
Connector factories all have a Pool class that should be derived from a base Pool class --- Key: CONNECTORS-180 URL: https://issues.apache.org/jira/browse/CONNECTORS-180 Project: ManifoldCF Issue Type: Improvement Components: Framework core Reporter: Karl Wright Priority: Minor There's a fair bit of duplicated code in the connector factories - RepositoryConnectorFactory, AuthorityConnectorFactory, etc. The duplicated code can be easily eliminated by creating a base factory pool class. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
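As a rough illustration of the refactoring proposed in CONNECTORS-180, a shared base pool could hold the bookkeeping while each factory supplies only the part that differs. The class below is a sketch under assumptions: BasePoolSketch, grab, release, and create are hypothetical names chosen for this example and do not reflect the actual Pool classes in RepositoryConnectorFactory or AuthorityConnectorFactory.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical base class; names are illustrative, not ManifoldCF's API. */
abstract class BasePoolSketch<T> {
  private final Deque<T> idle = new ArrayDeque<>();
  private final int maxHandles;
  private int outstanding = 0;

  BasePoolSketch(int maxHandles) {
    this.maxHandles = maxHandles;
  }

  /** Subclasses supply only the connector-type-specific construction. */
  protected abstract T create();

  /** Hands out an idle handle, or creates one while under the limit. */
  synchronized T grab() {
    while (idle.isEmpty() && outstanding >= maxHandles) {
      try {
        wait();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for a handle", e);
      }
    }
    outstanding++;
    T handle = idle.pollFirst();
    return handle != null ? handle : create();
  }

  /** Returns a handle to the pool and wakes up any waiting thread. */
  synchronized void release(T handle) {
    outstanding--;
    idle.addFirst(handle);
    notifyAll();
  }
}
```

With something along these lines, each connector factory's pool would shrink to a small subclass that overrides create(), and the wait/notify logic would live in one place.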
[jira] [Created] (CONNECTORS-181) ICacheDescription's way of handling object expiration is very clumsy for fixed lifetime objects
ICacheDescription's way of handling object expiration is very clumsy for fixed lifetime objects --- Key: CONNECTORS-181 URL: https://issues.apache.org/jira/browse/CONNECTORS-181 Project: ManifoldCF Issue Type: Improvement Components: Framework core Affects Versions: ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright The ICacheDescription getObjectExpirationTime() method works fine for the model where each object access resets the expiration to some point in the future, but it does not work well for the model where the object truly has a fixed expiration time based on its creation time. This is interfering with code for CONNECTORS-32. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
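The difference between the two expiration models described in CONNECTORS-181 can be made concrete with a small sketch. The ExpirationPolicy interface and its expirationTime method below are hypothetical stand-ins invented for this example; they are not the real ICacheDescription signature.

```java
final class ExpirationSketch {
  /** Hypothetical stand-in for the expiration part of a cache description. */
  interface ExpirationPolicy {
    /** Returns the absolute time (in ms) at which the object should expire. */
    long expirationTime(long accessTime);
  }

  /** Sliding model: every access pushes the expiration further out. */
  static ExpirationPolicy sliding(long lifetimeMs) {
    return accessTime -> accessTime + lifetimeMs;
  }

  /** Fixed model: expiration is anchored to creation; accesses do not move it. */
  static ExpirationPolicy fixed(long createTime, long lifetimeMs) {
    return accessTime -> createTime + lifetimeMs;
  }
}
```

In the sliding model the cache only needs the current time, whereas the fixed model needs the creation time to be remembered somewhere, which is presumably what makes it clumsy to express when the interface is asked only at access time.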
[jira] [Resolved] (CONNECTORS-182) Timed cache invalidation does not work
[ https://issues.apache.org/jira/browse/CONNECTORS-182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-182. Resolution: Fixed Fix Version/s: ManifoldCF next r1094053. Timed cache invalidation does not work -- Key: CONNECTORS-182 URL: https://issues.apache.org/jira/browse/CONNECTORS-182 Project: ManifoldCF Issue Type: Bug Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next Timed cache invalidation does not work because the polling thread apparently does not manage to call the cache manager periodically to tell it to invalidate. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-32) A general access-token cache within Authority Service would help performance
[ https://issues.apache.org/jira/browse/CONNECTORS-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-32. --- Resolution: Fixed Fix Version/s: ManifoldCF next r1094217. A general access-token cache within Authority Service would help performance Key: CONNECTORS-32 URL: https://issues.apache.org/jira/browse/CONNECTORS-32 Project: ManifoldCF Issue Type: Improvement Components: Authority Service Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next We should consider adding user-keyed per-connector access token cache within LCF's authority service. Individual connectors should be able to signal how long their cached tokens survive. This would help enormously with the case where dozens of requests for the same user are submitted for every page in the end-user UI. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
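To illustrate the idea behind CONNECTORS-32, here is a minimal sketch of a user-keyed, per-connection token cache with a connector-chosen lifetime. It is not the code committed in r1094217; the class name, method names, and parameters are assumptions made for this example, and the caller passes the current time explicitly to keep the sketch deterministic.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache; not the implementation committed for this ticket. */
final class TokenCacheSketch {
  private static final class Entry {
    final String[] tokens;
    final long expiresAt;
    Entry(String[] tokens, long expiresAt) {
      this.tokens = tokens;
      this.expiresAt = expiresAt;
    }
  }

  // Key is the (connection name, user name) pair.
  private final Map<List<String>, Entry> entries = new HashMap<>();

  /** Stores tokens with a fixed lifetime chosen by the connector. */
  synchronized void put(String connectionName, String userName,
                        String[] tokens, long now, long lifetimeMs) {
    entries.put(Arrays.asList(connectionName, userName),
                new Entry(tokens, now + lifetimeMs));
  }

  /** Returns cached tokens, or null if absent or expired. */
  synchronized String[] get(String connectionName, String userName, long now) {
    List<String> key = Arrays.asList(connectionName, userName);
    Entry entry = entries.get(key);
    if (entry == null) {
      return null;
    }
    if (now >= entry.expiresAt) {
      entries.remove(key);
      return null;
    }
    return entry.tokens;
  }
}
```

For the use case in the ticket, dozens of requests for the same user within one page render would hit the cache after the first lookup, and only the first would reach the underlying authority.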
[jira] [Assigned] (CONNECTORS-183) SECURITY_AUTHENTICATION-changeable
[ https://issues.apache.org/jira/browse/CONNECTORS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-183: -- Assignee: Karl Wright SECURITY_AUTHENTICATION-changeable -- Key: CONNECTORS-183 URL: https://issues.apache.org/jira/browse/CONNECTORS-183 Project: ManifoldCF Issue Type: Improvement Components: Authority Service Affects Versions: ManifoldCF next Environment: Microsoft Windows Server 2003 R2 Microsoft Windows Server 2008 R2 Reporter: Shinichiro Abe Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-183.patch ActiveDirectoryAuthority.java contains the call env.put(Context.SECURITY_AUTHENTICATION,"DIGEST-MD5 GSSAPI"); Users may want to change the constant string. In my Windows 2003/2008 environment, authentication does not work unless the value is set to "simple". Crawler-ui should allow users to change the authentication setting. See: http://java.sun.com/products/jndi/jndi-ldap-gl.html (java.naming.security.authentication). Direction of improvement (at this time): Crawler-ui allows users to input the AUTHENTICATION text value. ActiveDirectoryAuthority supports "none", "simple", and the authentication mechanism for the provider to use. ActiveDirectoryAuthority does not support "strong", SASL authentication, or the SSL protocol. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
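The change requested in CONNECTORS-183 amounts to passing the authentication mechanism in as a parameter instead of hard-coding it. The sketch below shows that idea using the standard javax.naming.Context constants. The class and method names are hypothetical, this is not the attached CONNECTORS-183.patch, and the environment is only built here, not used to open a connection.

```java
import java.util.Hashtable;
import javax.naming.Context;

/** Hypothetical helper; not the code from the attached patch. */
final class LdapEnvSketch {
  /**
   * Builds a JNDI environment in which the authentication mechanism
   * ("none", "simple", or a SASL mechanism list) is supplied by the
   * caller, for example from a value entered in the crawler UI.
   */
  static Hashtable<String, String> buildEnv(String providerUrl, String authentication,
                                             String principal, String credentials) {
    Hashtable<String, String> env = new Hashtable<>();
    env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
    env.put(Context.PROVIDER_URL, providerUrl);
    // Previously a constant; now configurable.
    env.put(Context.SECURITY_AUTHENTICATION, authentication);
    env.put(Context.SECURITY_PRINCIPAL, principal);
    env.put(Context.SECURITY_CREDENTIALS, credentials);
    return env;
  }
}
```

The resulting table would normally be handed to an InitialLdapContext constructor. Note that with "simple" the credentials travel in clear text unless the connection is otherwise protected.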
[jira] [Created] (CONNECTORS-187) WorkerThread method processDeleteList does not handle ServiceInterruptions from output connector optimally
WorkerThread method processDeleteList does not handle ServiceInterruptions from output connector optimally -- Key: CONNECTORS-187 URL: https://issues.apache.org/jira/browse/CONNECTORS-187 Project: ManifoldCF Issue Type: Improvement Components: Framework crawler agent Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright The processDeleteList method in WorkerThread does not handle ServiceInterruption exceptions optimally; it just waits five minutes and retries. What it should do is requeue all the affected documents for the prescribed time, ignoring the possibility of failure or skip, since neither of these can be performed when the output connection is not working. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CONNECTORS-188) CRLF line endings in executecommand.sh script
[ https://issues.apache.org/jira/browse/CONNECTORS-188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-188: -- Assignee: Karl Wright CRLF line endings in executecommand.sh script - Key: CONNECTORS-188 URL: https://issues.apache.org/jira/browse/CONNECTORS-188 Project: ManifoldCF Issue Type: Bug Components: Framework agents process Environment: *NIX Reporter: Erlend Garåsen Assignee: Karl Wright Priority: Critical Fix For: ManifoldCF 0.2 The executecommand.sh script cannot be run since CRLF line endings have been added: $ bash -x processes/script/executecommand.sh + $'\r' : command not foundecutecommand.sh: line 2: + $'\r' : command not foundecutecommand.sh: line 17: 'rocesses/script/executecommand.sh: line 30: syntax error near unexpected token `do 'rocesses/script/executecommand.sh: line 30: `for filename in $(ls -1 $MCF_HOME/processes/jar) ; do -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-188) CRLF line endings in executecommand.sh script
[ https://issues.apache.org/jira/browse/CONNECTORS-188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-188. Resolution: Fixed Fix Version/s: ManifoldCF next r1096922 (trunk) r1096925 (release branch) CRLF line endings in executecommand.sh script - Key: CONNECTORS-188 URL: https://issues.apache.org/jira/browse/CONNECTORS-188 Project: ManifoldCF Issue Type: Bug Components: Framework agents process Environment: *NIX Reporter: Erlend Garåsen Assignee: Karl Wright Priority: Critical Fix For: ManifoldCF 0.2, ManifoldCF next The executecommand.sh script cannot be run since CRLF line endings have been added: $ bash -x processes/script/executecommand.sh + $'\r' : command not foundecutecommand.sh: line 2: + $'\r' : command not foundecutecommand.sh: line 17: 'rocesses/script/executecommand.sh: line 30: syntax error near unexpected token `do 'rocesses/script/executecommand.sh: line 30: `for filename in $(ls -1 $MCF_HOME/processes/jar) ; do -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
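The errors quoted in CONNECTORS-188 are what bash reports when every line of a script ends in a carriage return. The committed fix is in the revisions listed above; purely as an illustration, the following sketch shows one way a build or release check could detect and strip CRLF endings from script files. The class and method names are hypothetical and not part of ManifoldCF.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical utility for normalizing shell script line endings. */
final class LineEndingSketch {
  /** True if the text contains at least one CRLF pair. */
  static boolean hasCrlf(String text) {
    return text.contains("\r\n");
  }

  /** Replaces every CRLF pair with a bare LF. */
  static String toUnixLineEndings(String text) {
    return text.replace("\r\n", "\n");
  }

  /** Rewrites each file named on the command line in place if needed. */
  public static void main(String[] args) throws IOException {
    for (String name : args) {
      Path path = Paths.get(name);
      String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
      if (hasCrlf(text)) {
        Files.write(path, toUnixLineEndings(text).getBytes(StandardCharsets.UTF_8));
      }
    }
  }
}
```

Running such a check over the .sh files before packaging would catch the problem regardless of which platform the release was built on.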
[jira] [Commented] (CONNECTORS-189) Add the mail archive links to the mail.html page
[ https://issues.apache.org/jira/browse/CONNECTORS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13025533#comment-13025533 ] Karl Wright commented on CONNECTORS-189: Do you want to contribute a patch as well? It's not hard - the file is site/src/documentation/content/xdocs/mail.xml. There are other xml docs there that show how to do links in Forrest, if you grovel around. Add the mail archive links to the mail.html page Key: CONNECTORS-189 URL: https://issues.apache.org/jira/browse/CONNECTORS-189 Project: ManifoldCF Issue Type: Improvement Reporter: Farzad Priority: Minor I think this would best be added to the mail.html page, which describes the mail lists and how to sign up for them. Please feel free to open a jira ticket accordingly. Thanks! Karl On Tue, Apr 26, 2011 at 11:34 AM, conflue...@apache.org wrote: Space: Apache Connectors Framework (https://cwiki.apache.org/confluence/display/CONNECTORS) Page: FAQ (https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ) Comment: https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ?focusedCommentId=26119029#comment-26119029 Comment added by Farzad: - Found the root links, this is nice. Might want to add these to the FAQ. Do you know if there is a way to view snippets of the messages without having to click on each one? http://www.mail-archive.com/connectors-user@incubator.apache.org/index.html http://www.mail-archive.com/connectors-dev@incubator.apache.org/index.html http://www.mail-archive.com/general@incubator.apache.org/index.html In reply to a comment by Karl Wright: The news lists are in fact kept around; you can in fact use google to find old posts. Try googling ManifoldCF eclipse to see what I mean. Change your notification preferences: https://cwiki.apache.org/confluence/users/viewnotifications.action -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-189) Add the mail archive links to the mail.html page
[ https://issues.apache.org/jira/browse/CONNECTORS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13025542#comment-13025542 ] Karl Wright commented on CONNECTORS-189: That looks about right. But you'll want to submit it in true patch form, and attach the patch to the ticket. Instructions for how to do that are in the How to contribute page of the wiki. Here's the link: https://cwiki.apache.org/confluence/display/CONNECTORS/HowToContribute Add the mail archive links to the mail.html page Key: CONNECTORS-189 URL: https://issues.apache.org/jira/browse/CONNECTORS-189 Project: ManifoldCF Issue Type: Improvement Reporter: Farzad Priority: Minor I think this would best be added to the mail.html page, which describes the mail lists and how to sign up for them. Please feel free to open a jira ticket accordingly. Thanks! Karl On Tue, Apr 26, 2011 at 11:34 AM, conflue...@apache.org wrote: Space: Apache Connectors Framework (https://cwiki.apache.org/confluence/display/CONNECTORS) Page: FAQ (https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ) Comment: https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ?focusedCommentId=26119029#comment-26119029 Comment added by Farzad: - Found the root links, this is nice. Might want to add these to the FAQ. Do you know if there is a way to view snippets of the messages without having to click on each one? http://www.mail-archive.com/connectors-user@incubator.apache.org/index.html http://www.mail-archive.com/connectors-dev@incubator.apache.org/index.html http://www.mail-archive.com/general@incubator.apache.org/index.html In reply to a comment by Karl Wright: The news lists are in fact kept around; you can in fact use google to find old posts. Try googling ManifoldCF eclipse to see what I mean. Change your notification preferences: https://cwiki.apache.org/confluence/users/viewnotifications.action -- This message is automatically generated by JIRA. 
[jira] [Updated] (CONNECTORS-189) Add the mail archive links to the mail.html page
[ https://issues.apache.org/jira/browse/CONNECTORS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-189: --- Resolution: Fixed Fix Version/s: (was: ManifoldCF 0.2) ManifoldCF next Assignee: Karl Wright Status: Resolved (was: Patch Available) Looks good. Committed - r1096998. Add the mail archive links to the mail.html page Key: CONNECTORS-189 URL: https://issues.apache.org/jira/browse/CONNECTORS-189 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 0.2 Reporter: Farzad Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-189.patch, mail.xml I think this would best be added to the mail.html page, which describes the mail lists and how to sign up for them. Please feel free to open a jira ticket accordingly. Thanks! Karl On Tue, Apr 26, 2011 at 11:34 AM, conflue...@apache.org wrote: Space: Apache Connectors Framework (https://cwiki.apache.org/confluence/display/CONNECTORS) Page: FAQ (https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ) Comment: https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ?focusedCommentId=26119029#comment-26119029 Comment added by Farzad: - Found the root links, this is nice. Might want to add these to the FAQ. Do you know if there is a way to view snippets of the messages without having to click on each one? http://www.mail-archive.com/connectors-user@incubator.apache.org/index.html http://www.mail-archive.com/connectors-dev@incubator.apache.org/index.html http://www.mail-archive.com/general@incubator.apache.org/index.html In reply to a comment by Karl Wright: The news lists are in fact kept around; you can in fact use google to find old posts. Try googling ManifoldCF eclipse to see what I mean. Change your notification preferences: https://cwiki.apache.org/confluence/users/viewnotifications.action -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-190) Programmatic operation page has a wiki link in a Forrest page
Programmatic operation page has a wiki link in a Forrest page - Key: CONNECTORS-190 URL: https://issues.apache.org/jira/browse/CONNECTORS-190 Project: ManifoldCF Issue Type: Bug Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Priority: Minor The following wiki markup is present in the programmatic-operation.xml page: [here|http://www.json.org] This should be an a href tag instead.
[jira] [Resolved] (CONNECTORS-190) Programmatic operation page has a wiki link in a Forrest page
[ https://issues.apache.org/jira/browse/CONNECTORS-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-190. Resolution: Fixed Assignee: Karl Wright r1097192. Programmatic operation page has a wiki link in a Forrest page - Key: CONNECTORS-190 URL: https://issues.apache.org/jira/browse/CONNECTORS-190 Project: ManifoldCF Issue Type: Bug Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Priority: Minor The following wiki markup is present in the programmatic-operation.xml page: [here|http://www.json.org] This should be an a href tag instead.
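The conversion this issue asks for, from the wiki form [label|url] to an HTML anchor, could be sketched as follows. This is only an illustration: the class and method names are hypothetical, the commit r1097192 itself is not shown in this thread, and the URL is assumed to need no attribute escaping.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper only; not part of ManifoldCF or Forrest.
final class WikiLinkSketch {
  // Matches wiki-style links of the form [label|url].
  private static final Pattern WIKI_LINK =
      Pattern.compile("\\[([^\\]|]+)\\|([^\\]]+)\\]");

  // Rewrites every [label|url] occurrence as an HTML anchor element.
  // Assumption: the URL contains no characters that need escaping.
  static String toAnchor(String text) {
    Matcher m = WIKI_LINK.matcher(text);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String anchor = "<a href=\"" + m.group(2) + "\">" + m.group(1) + "</a>";
      m.appendReplacement(out, Matcher.quoteReplacement(anchor));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

Text without any wiki link passes through unchanged, so the helper can be run over a whole page source.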
[jira] [Commented] (CONNECTORS-189) Add the mail archive links to the mail.html page
[ https://issues.apache.org/jira/browse/CONNECTORS-189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026154#comment-13026154 ] Karl Wright commented on CONNECTORS-189: Please! What did you have in mind? (Bear in mind that there's already a Lucid Imagination search box at the top.) Add the mail archive links to the mail.html page Key: CONNECTORS-189 URL: https://issues.apache.org/jira/browse/CONNECTORS-189 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 0.2 Reporter: Farzad Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-189.patch, mail.xml I think this would best be added to the mail.html page, which describes the mail lists and how to sign up for them. Please feel free to open a jira ticket accordingly. Thanks! Karl On Tue, Apr 26, 2011 at 11:34 AM, conflue...@apache.org wrote: Space: Apache Connectors Framework (https://cwiki.apache.org/confluence/display/CONNECTORS) Page: FAQ (https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ) Comment: https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ?focusedCommentId=26119029#comment-26119029 Comment added by Farzad: - Found the root links, this is nice. Might want to add these to the FAQ. Do you know if there is a way to view snippets of the messages without having to click on each one? http://www.mail-archive.com/connectors-user@incubator.apache.org/index.html http://www.mail-archive.com/connectors-dev@incubator.apache.org/index.html http://www.mail-archive.com/general@incubator.apache.org/index.html In reply to a comment by Karl Wright: The news lists are in fact kept around; you can in fact use google to find old posts. Try googling ManifoldCF eclipse to see what I mean. Change your notification preferences: https://cwiki.apache.org/confluence/users/viewnotifications.action
[jira] [Created] (CONNECTORS-191) All .bat files should have CRLF attribute in svn
All .bat files should have CRLF attribute in svn Key: CONNECTORS-191 URL: https://issues.apache.org/jira/browse/CONNECTORS-191 Project: ManifoldCF Issue Type: Bug Components: Documentum connector, FileNet connector, Framework agents process Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Priority: Minor All .bat files should have svn:eol-style CRLF in svn, so they always have cr/lf endings.
[jira] [Resolved] (CONNECTORS-191) All .bat files should have CRLF attribute in svn
[ https://issues.apache.org/jira/browse/CONNECTORS-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-191. Resolution: Fixed Fix Version/s: ManifoldCF next r1097349 All .bat files should have CRLF attribute in svn Key: CONNECTORS-191 URL: https://issues.apache.org/jira/browse/CONNECTORS-191 Project: ManifoldCF Issue Type: Bug Components: Documentum connector, FileNet connector, Framework agents process Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next All .bat files should have svn:eol-style CRLF in svn, so they always have cr/lf endings.
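The svn:eol-style property is set in Subversion itself, so the fix needs no code. To confirm that a checked-out .bat file really has cr/lf endings throughout, a small check along these lines might help. It is a sketch only; the class and method names are made up for illustration and are not part of the ManifoldCF build.

```java
// Illustrative check only; not part of the ManifoldCF build.
final class CrlfCheckSketch {
  // Returns true when every LF is immediately preceded by a CR
  // and every CR is immediately followed by an LF.
  static boolean hasOnlyCrlfEndings(byte[] content) {
    for (int i = 0; i < content.length; i++) {
      if (content[i] == '\n' && (i == 0 || content[i - 1] != '\r')) {
        return false; // bare LF (Unix-style ending)
      }
      if (content[i] == '\r' && (i + 1 == content.length || content[i + 1] != '\n')) {
        return false; // bare CR
      }
    }
    return true;
  }
}
```

A file with no line endings at all also passes, since it contains nothing that contradicts the CRLF convention.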
[jira] [Created] (CONNECTORS-192) Job-related specification post method sometimes called without corresponding specification header/body
Job-related specification post method sometimes called without corresponding specification header/body -- Key: CONNECTORS-192 URL: https://issues.apache.org/jira/browse/CONNECTORS-192 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright For specification tabs, sometimes the specification post method is called when the corresponding specification header/body method wasn't. This can happen for both repository and output connectors.
[jira] [Created] (CONNECTORS-193) Not all output connectors adhere to the standard convention for naming of tabs, form elements, and javascript methods
Not all output connectors adhere to the standard convention for naming of tabs, form elements, and javascript methods - Key: CONNECTORS-193 URL: https://issues.apache.org/jira/browse/CONNECTORS-193 Project: ManifoldCF Issue Type: Bug Components: GTS connector, Lucene/SOLR connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright The convention for form elements and javascript methods is that all element names and methods must begin with lowercase oc. The convention for output specification tabs is that the tab name should contain the name of the target, e.g. GTS Parameters or Solr Metadata Mapping.
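The first convention (names begin with lowercase oc) is easy to check mechanically. The following is a minimal sketch, assuming the element and method names have already been collected into a list. The class name and the sample names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative check only; not part of ManifoldCF.
final class OutputNamingSketch {
  // Returns the names that do not begin with lowercase "oc".
  static List<String> violations(List<String> names) {
    List<String> bad = new ArrayList<>();
    for (String name : names) {
      if (name == null || !name.startsWith("oc")) {
        bad.add(name);
      }
    }
    return bad;
  }
}
```

For example, given the made-up names ocCheckSpecification, checkSpecification and OCServer, only the first one follows the convention.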
[jira] [Created] (CONNECTORS-194) Forrest doc build always gets an error because of relative references to javadoc roots
Forrest doc build always gets an error because of relative references to javadoc roots -- Key: CONNECTORS-194 URL: https://issues.apache.org/jira/browse/CONNECTORS-194 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Forrest is not very happy with generating a relative link to the javadoc roots, since the javadoc itself is not under Forrest's control. Somebody needs to find a better way of handling this.
[jira] [Created] (CONNECTORS-195) Active directory authority doesn't handle unknown user case properly
Active directory authority doesn't handle unknown user case properly Key: CONNECTORS-195 URL: https://issues.apache.org/jira/browse/CONNECTORS-195 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright The active directory authority does not properly detect a nonexistent user in Active Directory. Instead it returns S-1-1-0, which permits the unknown user to see all public documents.
[jira] [Updated] (CONNECTORS-195) Active directory authority doesn't handle unknown user case properly
[ https://issues.apache.org/jira/browse/CONNECTORS-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-195: --- Attachment: CONNECTORS-195.patch Patch which may work to resolve the issue Active directory authority doesn't handle unknown user case properly Key: CONNECTORS-195 URL: https://issues.apache.org/jira/browse/CONNECTORS-195 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Attachments: CONNECTORS-195.patch The active directory authority does not properly detect a nonexistent user in Active Directory. Instead it returns S-1-1-0, which permits the unknown user to see all public documents.
[jira] [Commented] (CONNECTORS-195) Active directory authority doesn't handle unknown user case properly
[ https://issues.apache.org/jira/browse/CONNECTORS-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028090#comment-13028090 ] Karl Wright commented on CONNECTORS-195: The patch requires the name of an attribute that all users have. uid is what it uses now. Online references are not clear on whether or not this will always work with Active Directory. It especially does not seem to exist for Windows 2000. Another suggestion is sAMAccountName, which exists for all versions of Windows. Replacing uid in the patch with sAMAccountName may therefore make it work better. Active directory authority doesn't handle unknown user case properly Key: CONNECTORS-195 URL: https://issues.apache.org/jira/browse/CONNECTORS-195 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Attachments: CONNECTORS-195.patch The active directory authority does not properly detect a nonexistent user in Active Directory. Instead it returns S-1-1-0, which permits the unknown user to see all public documents.
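To illustrate the choice of lookup attribute discussed in this comment, here is a rough sketch of building an LDAP search filter with the attribute name passed in as a parameter, so that uid and sAMAccountName can be swapped. The attached patch is not reproduced in this thread, so everything here is an assumption: the class and method names are hypothetical, and the (objectClass=user) clause is a guess at how user entries might be selected. An empty search result for such a filter would then mean "user not found" rather than a fallback to S-1-1-0.

```java
// Illustrative sketch only; the actual CONNECTORS-195 patch may differ.
final class UserLookupSketch {
  // Escapes the characters that RFC 4515 reserves inside a filter value.
  static String escapeFilterValue(String value) {
    StringBuilder sb = new StringBuilder();
    for (char c : value.toCharArray()) {
      switch (c) {
        case '\\': sb.append("\\5c"); break;
        case '*':  sb.append("\\2a"); break;
        case '(':  sb.append("\\28"); break;
        case ')':  sb.append("\\29"); break;
        case '\0': sb.append("\\00"); break;
        default:   sb.append(c);
      }
    }
    return sb.toString();
  }

  // Builds a filter matching a user entry by the given attribute,
  // e.g. "uid" or "sAMAccountName". The objectClass clause is an assumption.
  static String userFilter(String attribute, String userName) {
    return "(&(objectClass=user)(" + attribute + "=" + escapeFilterValue(userName) + "))";
  }
}
```

Keeping the attribute name as a parameter makes it cheap to try sAMAccountName in place of uid, as the comment suggests.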
[jira] [Commented] (CONNECTORS-195) Active directory authority doesn't handle unknown user case properly
[ https://issues.apache.org/jira/browse/CONNECTORS-195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028091#comment-13028091 ] Karl Wright commented on CONNECTORS-195: The following reference is very helpful. http://msdn.microsoft.com/en-us/library/ms679635%28v=VS.85%29.aspx Active directory authority doesn't handle unknown user case properly Key: CONNECTORS-195 URL: https://issues.apache.org/jira/browse/CONNECTORS-195 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Attachments: CONNECTORS-195.patch The active directory authority does not properly detect a nonexistent user in Active Directory. Instead it returns S-1-1-0, which permits the unknown user to see all public documents.
[jira] [Commented] (CONNECTORS-194) Forrest doc build always gets an error because of relative references to javadoc roots
[ https://issues.apache.org/jira/browse/CONNECTORS-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028282#comment-13028282 ] Karl Wright commented on CONNECTORS-194: cli-xconf seems definitely the way to go. But we're still going to need a jumping-off page that is handled by Forrest (which is what javadoc.html does for us) because not all connectors are buildable or can be javadoc'd, depending on the existence of the needed third-party libraries. If the api subdirectory is the new place where the javadoc roots are all put, that's fine by me. Forrest doc build always gets an error because of relative references to javadoc roots -- Key: CONNECTORS-194 URL: https://issues.apache.org/jira/browse/CONNECTORS-194 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Forrest is not very happy with generating a relative link to the javadoc roots, since the javadoc itself is not under Forrest's control. Somebody needs to find a better way of handling this.
[jira] [Issue Comment Edited] (CONNECTORS-194) Forrest doc build always gets an error because of relative references to javadoc roots
[ https://issues.apache.org/jira/browse/CONNECTORS-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028282#comment-13028282 ] Karl Wright edited comment on CONNECTORS-194 at 5/3/11 4:08 PM: cli-xconf seems definitely the way to go. But we're still going to need a jumping-off page that is handled by Forrest (which is what javadoc.html does for us). Moving javadoc.html to api/index.html for this purpose is also OK if we can get Forrest to work with it properly in that location. If the api subdirectory is the new place where the javadoc roots are all put, that's fine by me. was (Author: kwri...@metacarta.com): cli-xconf seems definitely the way to go. But we're still going to need a jumping-off page that is handled by Forrest (which is what javadoc.html does for us) because not all connectors are buildable or can be javadoc'd, depending on the existence of the needed third-party libraries. If the api subdirectory is the new place where the javadoc roots are all put, that's fine by me. Forrest doc build always gets an error because of relative references to javadoc roots -- Key: CONNECTORS-194 URL: https://issues.apache.org/jira/browse/CONNECTORS-194 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Forrest is not very happy with generating a relative link to the javadoc roots, since the javadoc itself is not under Forrest's control. Somebody needs to find a better way of handling this.
[jira] [Assigned] (CONNECTORS-195) Active directory authority doesn't handle unknown user case properly
[ https://issues.apache.org/jira/browse/CONNECTORS-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-195: -- Assignee: Karl Wright Active directory authority doesn't handle unknown user case properly Key: CONNECTORS-195 URL: https://issues.apache.org/jira/browse/CONNECTORS-195 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Attachments: CONNECTORS-195.patch The active directory authority does not properly detect a nonexistent user in Active Directory. Instead it returns S-1-1-0, which permits the unknown user to see all public documents.
[jira] [Resolved] (CONNECTORS-195) Active directory authority doesn't handle unknown user case properly
[ https://issues.apache.org/jira/browse/CONNECTORS-195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-195. Resolution: Fixed Fix Version/s: ManifoldCF next Verified that the committed fix does the expected thing on a certain user's setup. Awaiting final verification that it does not break a user with a correct setup, although this would be extremely unlikely. Active directory authority doesn't handle unknown user case properly Key: CONNECTORS-195 URL: https://issues.apache.org/jira/browse/CONNECTORS-195 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next Attachments: CONNECTORS-195.patch The active directory authority does not properly detect a nonexistent user in Active Directory. Instead it returns S-1-1-0, which permits the unknown user to see all public documents.
[jira] [Resolved] (CONNECTORS-196) Active directory authority doesn't work if login name and common name differ
[ https://issues.apache.org/jira/browse/CONNECTORS-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-196. Resolution: Fixed Fix Version/s: ManifoldCF next r1100090 Active directory authority doesn't work if login name and common name differ Key: CONNECTORS-196 URL: https://issues.apache.org/jira/browse/CONNECTORS-196 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next Active directory authority will not work if common name (cn) and login name (sAMAccountName) differ.
[jira] [Created] (CONNECTORS-196) Active directory authority doesn't work if login name and common name differ
Active directory authority doesn't work if login name and common name differ Key: CONNECTORS-196 URL: https://issues.apache.org/jira/browse/CONNECTORS-196 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Active directory authority will not work if common name (cn) and login name (sAMAccountName) differ.
[jira] [Commented] (CONNECTORS-196) Active directory authority doesn't work if login name and common name differ
[ https://issues.apache.org/jira/browse/CONNECTORS-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029779#comment-13029779 ] Karl Wright commented on CONNECTORS-196: Also, r1100097. Active directory authority doesn't work if login name and common name differ Key: CONNECTORS-196 URL: https://issues.apache.org/jira/browse/CONNECTORS-196 Project: ManifoldCF Issue Type: Bug Components: Active Directory authority Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next Active directory authority will not work if common name (cn) and login name (sAMAccountName) differ.
[jira] [Commented] (CONNECTORS-194) Forrest doc build always gets an error because of relative references to javadoc roots
[ https://issues.apache.org/jira/browse/CONNECTORS-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030551#comment-13030551 ] Karl Wright commented on CONNECTORS-194: Patch looks great! I'll commit it tomorrow morning. Forrest doc build always gets an error because of relative references to javadoc roots -- Key: CONNECTORS-194 URL: https://issues.apache.org/jira/browse/CONNECTORS-194 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Attachments: CONNECTORS-194.patch Forrest is not very happy with generating a relative link to the javadoc roots, since the javadoc itself is not under Forrest's control. Somebody needs to find a better way of handling this.
[jira] [Commented] (CONNECTORS-197) Active directory authority provides a compatibility switch for getting SID.
[ https://issues.apache.org/jira/browse/CONNECTORS-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030568#comment-13030568 ] Karl Wright commented on CONNECTORS-197: You might want to have a look at the changes I committed today from Kadri. I think your patch may need to be changed to just substitute the uid attribute for the sAMAccountName attribute when the switch is set. Active directory authority provides a compatibility switch for getting SID. Key: CONNECTORS-197 URL: https://issues.apache.org/jira/browse/CONNECTORS-197 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority Reporter: Shinichiro Abe Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-197-temp.patch When using /UserACLs?username=foo@bar, MCF currently always refers to sAMAccountName. The sAMAccountName is specified as fewer than 20 characters, whereas the login name is specified as more than 20 characters (256). If a deployment does not need to support old OS versions and supports only new ones, it is hard for ManifoldCF to be restricted to 20-character login names. We want a compatibility switch in the configuration.
[jira] [Created] (CONNECTORS-198) The build.xml rat-source target complains about files in the test-output-postgres folders
The build.xml rat-source target complains about files in the test-output-postgres folders - Key: CONNECTORS-198 URL: https://issues.apache.org/jira/browse/CONNECTORS-198 Project: ManifoldCF Issue Type: Bug Affects Versions: ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright If you run ant rat-sources, you get complaints about files under various test-output-postgresql folders, which should be excluded.
[jira] [Resolved] (CONNECTORS-198) The build.xml rat-source target complains about files in the test-output-postgres folders
[ https://issues.apache.org/jira/browse/CONNECTORS-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-198. Resolution: Fixed Fix Version/s: ManifoldCF next r1100916. The build.xml rat-source target complains about files in the test-output-postgres folders - Key: CONNECTORS-198 URL: https://issues.apache.org/jira/browse/CONNECTORS-198 Project: ManifoldCF Issue Type: Bug Affects Versions: ManifoldCF 0.2, ManifoldCF next Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF next If you run ant rat-sources, you get complaints about files under various test-output-postgresql folders, which should be excluded.
[jira] [Commented] (CONNECTORS-197) Active directory authority provides a compatibility switch for getting SID.
[ https://issues.apache.org/jira/browse/CONNECTORS-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030669#comment-13030669 ] Karl Wright commented on CONNECTORS-197: In the patch, the following lines look incorrect: +option value=\sAMAccountName\+(org.apache.manifoldcf.ui.util.Encoder.attributeEscape(userACLsUsername).equals(sAMAccountName)? selected=\true\:)+sAMAccountName/option\n+ +option value=\userPrincipalName\+(org.apache.manifoldcf.ui.util.Encoder.attributeEscape(userACLsUsername).equals(userPrincipalName)? selected=\true\:)+userPrincipalName/option\n+ You want to attributeEscape the value attribute, not the equals compare. Other than that, the patch looks good. Would you like to fix this and I will go ahead and commit it? Active directory authority provides a compatibility switch for getting SID. Key: CONNECTORS-197 URL: https://issues.apache.org/jira/browse/CONNECTORS-197 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority Reporter: Shinichiro Abe Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-197-temp.patch, CONNECTORS-197.patch When using /UserACLs?username=foo@bar, MCF currently always refers to sAMAccountName. The sAMAccountName is specified as fewer than 20 characters, whereas the login name is specified as more than 20 characters (256). If a deployment does not need to support old OS versions and supports only new ones, it is hard for ManifoldCF to be restricted to 20-character login names. We want a compatibility switch in the configuration.
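Since Jira stripped the markup from the quoted patch lines, here is a hedged sketch of what the review comment seems to be asking for: compare the raw strings with equals, and apply the escaping only where the value is written into the HTML attribute. The attributeEscape method below is a local stand-in written for this example and is not the real org.apache.manifoldcf.ui.util.Encoder. The class and method names are hypothetical, and the committed code may look different.

```java
// Illustrative sketch only; not the committed CONNECTORS-197 code.
final class OptionMarkupSketch {
  // Local stand-in for an HTML attribute escaper (assumption: the real
  // Encoder.attributeEscape may escape a different set of characters).
  static String attributeEscape(String value) {
    StringBuilder sb = new StringBuilder();
    for (char c : value.toCharArray()) {
      switch (c) {
        case '&': sb.append("&amp;"); break;
        case '"': sb.append("&quot;"); break;
        case '<': sb.append("&lt;"); break;
        case '>': sb.append("&gt;"); break;
        default:  sb.append(c);
      }
    }
    return sb.toString();
  }

  // Renders one option element. The selection test compares raw strings;
  // escaping is applied only to what is emitted into the markup.
  static String option(String optionValue, String userACLsUsername) {
    String selected = optionValue.equals(userACLsUsername) ? " selected=\"true\"" : "";
    return "<option value=\"" + attributeEscape(optionValue) + "\"" + selected + ">"
        + attributeEscape(optionValue) + "</option>\n";
  }
}
```

Comparing before escaping matters because an escaped string is no longer equal to its raw form once it contains any of the escaped characters.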
[jira] [Commented] (CONNECTORS-197) Active directory authority provides a compatibility switch for getting SID.
[ https://issues.apache.org/jira/browse/CONNECTORS-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030670#comment-13030670 ] Karl Wright commented on CONNECTORS-197: It looks like I confused Jira on the previous comment ;-) Hopefully you will be able to figure out what I meant. Active directory authority provides a compatibility switch for getting SID. Key: CONNECTORS-197 URL: https://issues.apache.org/jira/browse/CONNECTORS-197 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority Reporter: Shinichiro Abe Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-197-temp.patch, CONNECTORS-197.patch When using /UserACLs?username=foo@bar, MCF currently always refers to sAMAccountName. The sAMAccountName is specified as fewer than 20 characters, whereas the login name is specified as more than 20 characters (256). If a deployment does not need to support old OS versions and supports only new ones, it is hard for ManifoldCF to be restricted to 20-character login names. We want a compatibility switch in the configuration.
[jira] [Updated] (CONNECTORS-174) The standard logging.ini file for the Quick Start should set a log format that includes at least date and time
[ https://issues.apache.org/jira/browse/CONNECTORS-174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-174: --- Resolution: Fixed Fix Version/s: ManifoldCF next Status: Resolved (was: Patch Available) r1100971. The standard logging.ini file for the Quick Start should set a log format that includes at least date and time -- Key: CONNECTORS-174 URL: https://issues.apache.org/jira/browse/CONNECTORS-174 Project: ManifoldCF Issue Type: Improvement Components: Examples Affects Versions: ManifoldCF next Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-174.patch The log format as currently set by default for the Quick Start could be better if it included a date, time, and maybe a thread ID.
[jira] [Resolved] (CONNECTORS-197) Active directory authority provides a compatibility switch for getting SID.
[ https://issues.apache.org/jira/browse/CONNECTORS-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-197. Resolution: Fixed Assignee: Karl Wright r1100987. Active directory authority provides a compatibility switch for getting SID. Key: CONNECTORS-197 URL: https://issues.apache.org/jira/browse/CONNECTORS-197 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority Reporter: Shinichiro Abe Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-197-temp.patch, CONNECTORS-197.patch When using /UserACLs?username=foo@bar, MCF currently always refers to sAMAccountName. The sAMAccountName is specified as fewer than 20 characters, whereas the login name is specified as more than 20 characters (256). If a deployment does not need to support old OS versions and supports only new ones, it is hard for ManifoldCF to be restricted to 20-character login names. We want a compatibility switch in the configuration.
[jira] [Commented] (CONNECTORS-197) Active directory authority provides a compatibility switch for getting SID.
[ https://issues.apache.org/jira/browse/CONNECTORS-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030711#comment-13030711 ] Karl Wright commented on CONNECTORS-197: Yes, you are correct. The only thing I changed from your second patch was to add support for backwards compatibility in case the new parameter is not there. Active directory authority provides a compatibility switch for getting SID. Key: CONNECTORS-197 URL: https://issues.apache.org/jira/browse/CONNECTORS-197 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority Reporter: Shinichiro Abe Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next Attachments: CONNECTORS-197-temp.patch, CONNECTORS-197.patch When using /UserACLs?username=foo@bar, MCF currently always refers to sAMAccountName. The sAMAccountName is specified as fewer than 20 characters, whereas the login name is specified as more than 20 characters (256). If a deployment does not need to support old OS versions and supports only new ones, it is hard for ManifoldCF to be restricted to 20-character login names. We want a compatibility switch in the configuration.
[jira] [Created] (CONNECTORS-199) Modify site release page to include new release
Modify site release page to include new release --- Key: CONNECTORS-199 URL: https://issues.apache.org/jira/browse/CONNECTORS-199 Project: ManifoldCF Issue Type: Task Components: Documentation Affects Versions: ManifoldCF next Reporter: Karl Wright The site release page needs to be modified, so that the site points to the new release (0.2-incubating). Also, the PostgreSQL caveat only applies to the 0.1-incubating release, and will not apply to the 0.2-incubating release.
[jira] [Created] (CONNECTORS-200) Solr connector should treat TikaException the same as a 400 response
Solr connector should treat TikaException the same as a 400 response Key: CONNECTORS-200 URL: https://issues.apache.org/jira/browse/CONNECTORS-200 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.2, ManifoldCF 0.1, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Solr connector should treat TikaException the same as a 400 response, which is to skip the document.
[jira] [Commented] (CONNECTORS-200) Solr connector should treat TikaException the same as a 400 response
[ https://issues.apache.org/jira/browse/CONNECTORS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036166#comment-13036166 ] Karl Wright commented on CONNECTORS-200: r1124712 is a trial fix. Solr connector should treat TikaException the same as a 400 response Key: CONNECTORS-200 URL: https://issues.apache.org/jira/browse/CONNECTORS-200 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Solr connector should treat TikaException the same as a 400 response, which is to skip the document.
[jira] [Resolved] (CONNECTORS-200) Solr connector should treat TikaException the same as a 400 response
[ https://issues.apache.org/jira/browse/CONNECTORS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-200. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 r1125333 Solr connector should treat TikaException the same as a 400 response Key: CONNECTORS-200 URL: https://issues.apache.org/jira/browse/CONNECTORS-200 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3 Solr connector should treat TikaException the same as a 400 response, which is to skip the document.
[jira] [Created] (CONNECTORS-201) Carrydown methods should have their own interface class
Carrydown methods should have their own interface class Key: CONNECTORS-201 URL: https://issues.apache.org/jira/browse/CONNECTORS-201 Project: ManifoldCF Issue Type: Improvement Components: Framework crawler agent Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Priority: Minor The carrydown methods are shared in IVersionActivity and IProcessActivity. They ought to have their own interface.
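Editorial sketch of the refactoring this ticket asks for: the shared carrydown methods move into one parent interface that both activity interfaces extend. Only the names IVersionActivity and IProcessActivity come from the ticket. The name ICarrydownActivity, every method signature, and the in-memory implementation are assumptions made for illustration and may not match what was committed in r1126427.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: hypothetical signatures, not the actual ManifoldCF API.
public class CarrydownSketch {

  /** Hypothetical home for the methods both activity interfaces used to declare. */
  public interface ICarrydownActivity {
    String[] retrieveParentData(String documentId, String dataName);
  }

  /** Named in the ticket; versionOf() is an invented placeholder method. */
  public interface IVersionActivity extends ICarrydownActivity {
    String versionOf(String documentId);
  }

  /** Named in the ticket; recordProcessed() is an invented placeholder method. */
  public interface IProcessActivity extends ICarrydownActivity {
    void recordProcessed(String documentId);
  }

  /** One class can serve both roles and implements the carrydown method once. */
  public static class InMemoryActivities implements IVersionActivity, IProcessActivity {
    private final Map<String, String[]> parentData = new HashMap<String, String[]>();
    private int processedCount = 0;

    public void putParentData(String documentId, String dataName, String... values) {
      parentData.put(documentId + "/" + dataName, values);
    }

    @Override
    public String[] retrieveParentData(String documentId, String dataName) {
      String[] values = parentData.get(documentId + "/" + dataName);
      return values == null ? new String[0] : values;
    }

    @Override
    public String versionOf(String documentId) {
      return "v1:" + documentId;
    }

    @Override
    public void recordProcessed(String documentId) {
      processedCount++;
    }

    public int getProcessedCount() {
      return processedCount;
    }
  }
}
```

Code that only needs carried-down data can then accept the parent interface and work with either activity object.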
[jira] [Resolved] (CONNECTORS-201) Carrydown methods should have their own interface class
[ https://issues.apache.org/jira/browse/CONNECTORS-201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-201. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 r1126427 Carrydown methods should have their own interface class Key: CONNECTORS-201 URL: https://issues.apache.org/jira/browse/CONNECTORS-201 Project: ManifoldCF Issue Type: Improvement Components: Framework crawler agent Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3 The carrydown methods are shared in IVersionActivity and IProcessActivity. They ought to have their own interface.
[jira] [Commented] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library
[ https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038360#comment-13038360 ] Karl Wright commented on CONNECTORS-19: The promised patch never materialized. One point, though, is that ManifoldCF is not single-threaded in any case, so you'd be unlikely to gain much in performance by going multithread on an already multi-threaded connector implementation. The current connector can maintain and use as many connections to Solr as you tell it. Memory buffering on the client side also is not a good idea because it violates the basic ManifoldCF principle that you can safely shut down and restart ManifoldCF at any time without loss. Solr also suffers from lack of a guaranteed delivery metaphor, which I've talked to the Solr team about in the past. The Solr commit model currently does not work this way but ManifoldCF really requires it, because without it there is no way to properly implement an incremental crawler. This would mean a significant new Solr feature. Look into converting SOLR connector to use SolrJ java library Key: CONNECTORS-19 URL: https://issues.apache.org/jira/browse/CONNECTORS-19 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Reporter: Karl Wright Priority: Minor The SOLR connector currently uses its own multipart post code. It might be a good idea to convert it to use the SolrJ client api jar instead. This would require license confirmation, plus research to make sure there are no jar conflicts as a result, with any other connector.
[jira] [Commented] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library
[ https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038521#comment-13038521 ] Karl Wright commented on CONNECTORS-19: That's why this ticket was created - to explore using solrj instead of the homegrown code currently in the connector. However, there are issues we need to consider before solrj would be an option. The guaranteed delivery problem is one such. But also if SolrJ spins up its own threads it might well make it difficult to shut ManifoldCF down properly, depending on how those threads are created. Just as it is better to use an application server's thread pool when you are a web application, the same principles apply for threads created by connectors and their supporting libraries. If you have access to ManifoldCF in Action, you might want to have a look at chapters 5 and 6 for details. However, that does not rule solrj out, it just means we need to be cautious if and when the Solr connector is transitioned to use it. If you want to explore this in detail by all means feel free - patches are definitely welcome. Look into converting SOLR connector to use SolrJ java library Key: CONNECTORS-19 URL: https://issues.apache.org/jira/browse/CONNECTORS-19 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Reporter: Karl Wright Priority: Minor The SOLR connector currently uses its own multipart post code. It might be a good idea to convert it to use the SolrJ client api jar instead. This would require license confirmation, plus research to make sure there are no jar conflicts as a result, with any other connector.
[jira] [Commented] (CONNECTORS-202) SOLR connector support for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038658#comment-13038658 ] Karl Wright commented on CONNECTORS-202: Yes, making it explicit is preferred. But I thought you wanted to be able to set this on a per-job basis? SOLR connector support for commitWithin Key: CONNECTORS-202 URL: https://issues.apache.org/jira/browse/CONNECTORS-202 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.2 Reporter: Jan Høydahl Labels: commit The output connection must support commitWithin (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22) in addition to sending a commit() at the end of a job. This allows for efficient handling of commits on the Solr side. The parameter should ideally be configurable per job. In that way you could say that for Important job commitWithin=10s while for Big crawl job, commitWithin=600s.
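Editorial sketch of the per-job idea in this ticket, assuming a simple lookup from job name to a commitWithin value. In Solr's XML update format, commitWithin is an optional attribute on the add element and is given in milliseconds, so the 10s and 600s in the description become 10000 and 600000. The class, method names, and job-name map are hypothetical, and this is not how the connector actually posts documents.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: builds a Solr XML <add> message by hand with an optional commitWithin.
public class CommitWithinSketch {

  /** Minimal XML escaping for text content. */
  static String escape(String text) {
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }

  /** Returns the configured commitWithin (ms) for a job, or null if none is set. */
  static Integer commitWithinForJob(Map<String, Integer> perJobMillis, String jobName) {
    return perJobMillis.get(jobName);
  }

  /** Builds an <add> message; the attribute is omitted when commitWithinMillis is null. */
  static String buildAddMessage(String documentId, Integer commitWithinMillis) {
    StringBuilder message = new StringBuilder("<add");
    if (commitWithinMillis != null) {
      message.append(" commitWithin=\"").append(commitWithinMillis.intValue()).append("\"");
    }
    message.append("><doc><field name=\"id\">")
           .append(escape(documentId))
           .append("</field></doc></add>");
    return message.toString();
  }

  public static void main(String[] args) {
    Map<String, Integer> perJobMillis = new LinkedHashMap<String, Integer>();
    perJobMillis.put("Important job", 10000);   // 10s
    perJobMillis.put("Big crawl job", 600000);  // 600s
    System.out.println(buildAddMessage("doc-1", commitWithinForJob(perJobMillis, "Important job")));
    System.out.println(buildAddMessage("doc-2", commitWithinForJob(perJobMillis, "Unconfigured job")));
  }
}
```

A job without a configured value falls back to a plain add, leaving the commit to the end-of-job commit() the description mentions.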
[jira] [Commented] (CONNECTORS-203) Consider porting ManifoldCF to Java 1.5 code standards
[ https://issues.apache.org/jira/browse/CONNECTORS-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041503#comment-13041503 ] Karl Wright commented on CONNECTORS-203: I've created a branch (branches/CONNECTORS-203) for this work, and have begun the changes there. If the community agrees we should do this, we can finish the generics work in the branch and commit the whole thing into trunk. Consider porting ManifoldCF to Java 1.5 code standards Key: CONNECTORS-203 URL: https://issues.apache.org/jira/browse/CONNECTORS-203 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority, Authority Service, Build, Documentation, Documentum connector, File system connector, FileNet connector, Framework agents process, Framework core, Framework crawler agent, GTS connector, JCIFS connector, JDBC connector, LiveLink connector, Lucene/SOLR connector, Meridio connector, RSS connector, SharePoint connector, Web connector Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Consider porting ManifoldCF to Java 1.5 standards. This includes (but is not limited to): - build files - removing use of enum variable name - introducing generics in both implementation code and interfaces (cautiously)
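Editorial sketch of two of the listed changes, using invented helper methods rather than ManifoldCF code. Since Java 1.5, enum is a reserved word, so a local variable with that name must be renamed. Generics let a method declare its element type instead of casting at every use.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

// Illustration only: before/after style for a Java 1.5 port.
public class Java5PortSketch {

  /** Pre-1.5 style: a raw List, with a cast wherever an element is read. */
  @SuppressWarnings("rawtypes")
  static String firstNameRaw(List names) {
    return (String) names.get(0);
  }

  /** 1.5 style: the element type is part of the signature, so no cast is needed. */
  static String firstNameTyped(List<String> names) {
    return names.get(0);
  }

  /**
   * Older code often wrote "Enumeration enum = v.elements();". That no longer
   * compiles at source level 1.5 or later, so the variable gets another name.
   */
  static int countElements(Vector<String> values) {
    int count = 0;
    Enumeration<String> elements = values.elements();
    while (elements.hasMoreElements()) {
      elements.nextElement();
      count++;
    }
    return count;
  }

  public static void main(String[] args) {
    List<String> names = new ArrayList<String>();
    names.add("alpha");
    names.add("beta");
    System.out.println(firstNameRaw(names));
    System.out.println(firstNameTyped(names));
    System.out.println(countElements(new Vector<String>(names)));
  }
}
```

The raw and typed methods accept the same list, which is why the ticket can introduce generics gradually.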
[jira] [Commented] (CONNECTORS-110) Max activity and Max bandwidth reports don't work properly under Derby
[ https://issues.apache.org/jira/browse/CONNECTORS-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042655#comment-13042655 ] Karl Wright commented on CONNECTORS-110: HSQLDB is now also in roughly the same situation, although I've gotten a rough outline of a way to make this work involving temporary tables. This is as follows: SELECT * FROM (SELECT DISTINCT customerid FROM invoice) AS i_one, LATERAL ( SELECT id, total FROM invoice WHERE customerid = i_one.customerid ORDER BY total DESC LIMIT 1) AS i_two ... where invoice would be a temporary table created on the fly, as follows: DECLARE LOCAL TEMPORARY TABLE T AS (SELECT statement) [ON COMMIT { PRESERVE | DELETE } ROWS] For example: DECLARE LOCAL TEMPORARY TABLE invoice AS (SELECT * FROM whatever) ON COMMIT DELETE ROWS WITH DATA then perform the kind of query I suggested. The issue is that this does not fit in our single-query abstraction metaphor at all. Maybe a (different but identically named) stored procedure could be generated on all three databases that would do the trick. Alternatively, all databases could go the temporary table route, but then PostgreSQL would be unnecessarily crippled. Max activity and Max bandwidth reports don't work properly under Derby Key: CONNECTORS-110 URL: https://issues.apache.org/jira/browse/CONNECTORS-110 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Reporter: Karl Wright The reason for the failure is because the queries used are doing the Postgresql DISTINCT ON (xxx) syntax, which Derby does not support. Unfortunately, there does not seem to be a way in Derby at present to do anything similar to DISTINCT ON (xxx), and the queries really can't be done without that. One option is to introduce a getCapabilities() method into the database implementation, which would allow ACF to query the database capabilities before even presenting the report in the navigation menu in the UI. 
Another alternative is to do a sizable chunk of resultset processing within ACF, which would require not only the DISTINCT ON() implementation, but also the enclosing sort and limit stuff. It's the latter that would be most challenging, because of the difficulties with i18n etc.
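Editorial sketch of the second alternative in the description, doing the DISTINCT ON step on the client. It reuses the invoice example from the comment (columns id, customerid, total) and keeps, per customer, the row with the largest total. The class and method names are hypothetical, and the enclosing sort and limit, which the description calls the harder part, are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: in-memory equivalent of
//   SELECT DISTINCT ON (customerid) ... ORDER BY customerid, total DESC
public class DistinctOnSketch {

  /** One row of the example invoice table. */
  public static final class Invoice {
    public final int id;
    public final String customerId;
    public final long total;

    public Invoice(int id, String customerId, long total) {
      this.id = id;
      this.customerId = customerId;
      this.total = total;
    }
  }

  /**
   * Keeps one row per customer: the one with the highest total. On a tie the
   * earlier row wins. Customers come back in order of first appearance.
   */
  public static List<Invoice> topInvoicePerCustomer(List<Invoice> rows) {
    Map<String, Invoice> best = new LinkedHashMap<String, Invoice>();
    for (Invoice row : rows) {
      Invoice current = best.get(row.customerId);
      if (current == null || row.total > current.total) {
        best.put(row.customerId, row);
      }
    }
    return new ArrayList<Invoice>(best.values());
  }
}
```

This needs a single pass over the result set, but every candidate row has to be fetched from the database first, which is the cost the description is weighing.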
[jira] [Updated] (CONNECTORS-110) Max activity and Max bandwidth reports don't work properly under Derby or HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-110: Summary: Max activity and Max bandwidth reports don't work properly under Derby or HSQLDB (was: Max activity and Max bandwidth reports don't work properly under Derby) Max activity and Max bandwidth reports don't work properly under Derby or HSQLDB Key: CONNECTORS-110 URL: https://issues.apache.org/jira/browse/CONNECTORS-110 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Reporter: Karl Wright The reason for the failure is because the queries used are doing the Postgresql DISTINCT ON (xxx) syntax, which Derby does not support. Unfortunately, there does not seem to be a way in Derby at present to do anything similar to DISTINCT ON (xxx), and the queries really can't be done without that. One option is to introduce a getCapabilities() method into the database implementation, which would allow ACF to query the database capabilities before even presenting the report in the navigation menu in the UI. Another alternative is to do a sizable chunk of resultset processing within ACF, which would require not only the DISTINCT ON() implementation, but also the enclosing sort and limit stuff. It's the latter that would be most challenging, because of the difficulties with i18n etc.
[jira] [Created] (CONNECTORS-204) Now that HSQLDB functions with ManifoldCF, write a test-hsqldb ant target to test it
Now that HSQLDB functions with ManifoldCF, write a test-hsqldb ant target to test it Key: CONNECTORS-204 URL: https://issues.apache.org/jira/browse/CONNECTORS-204 Project: ManifoldCF Issue Type: Improvement Components: Build Reporter: Karl Wright The latest HSQLDB fixes and features make it an attractive alternative to Derby. But we need a test target that exercises it.
[jira] [Commented] (CONNECTORS-203) Consider porting ManifoldCF to Java 1.5 code standards
[ https://issues.apache.org/jira/browse/CONNECTORS-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042724#comment-13042724 ] Karl Wright commented on CONNECTORS-203: I've merged in all the major interface changes into trunk in r1130475. The branch can now go away and further changes be made incrementally on trunk. Consider porting ManifoldCF to Java 1.5 code standards Key: CONNECTORS-203 URL: https://issues.apache.org/jira/browse/CONNECTORS-203 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority, Authority Service, Build, Documentation, Documentum connector, File system connector, FileNet connector, Framework agents process, Framework core, Framework crawler agent, GTS connector, JCIFS connector, JDBC connector, LiveLink connector, Lucene/SOLR connector, Meridio connector, RSS connector, SharePoint connector, Web connector Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Consider porting ManifoldCF to Java 1.5 standards. This includes (but is not limited to): - build files - removing use of enum variable name - introducing generics in both implementation code and interfaces (cautiously)
[jira] [Created] (CONNECTORS-205) Database DISTINCT ON abstraction needs to include ordering information in order to work for HSQLDB
Database DISTINCT ON abstraction needs to include ordering information in order to work for HSQLDB Key: CONNECTORS-205 URL: https://issues.apache.org/jira/browse/CONNECTORS-205 Project: ManifoldCF Issue Type: Bug Components: Framework core, Framework crawler agent Reporter: Karl Wright The constructDistinctOnClause database method cannot support HSQLDB because it presumes that the ORDER BY clause is already part of the base query. This blocks us from using the HSQLDB WITH/LATERAL temporary table solution for the functionality. Adding ORDER BY information to the abstraction should work for all databases.
[jira] [Commented] (CONNECTORS-110) Max activity and Max bandwidth reports don't work properly under Derby or HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042861#comment-13042861 ] Karl Wright commented on CONNECTORS-110: r1130644 implements this for HSQLDB. Unfortunately, performance is extremely slow, even when the number of rows in the temporary table is only a few dozen. Max activity and Max bandwidth reports don't work properly under Derby or HSQLDB Key: CONNECTORS-110 URL: https://issues.apache.org/jira/browse/CONNECTORS-110 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Reporter: Karl Wright The reason for the failure is because the queries used are doing the Postgresql DISTINCT ON (xxx) syntax, which Derby does not support. Unfortunately, there does not seem to be a way in Derby at present to do anything similar to DISTINCT ON (xxx), and the queries really can't be done without that. One option is to introduce a getCapabilities() method into the database implementation, which would allow ACF to query the database capabilities before even presenting the report in the navigation menu in the UI. Another alternative is to do a sizable chunk of resultset processing within ACF, which would require not only the DISTINCT ON() implementation, but also the enclosing sort and limit stuff. It's the latter that would be most challenging, because of the difficulties with i18n etc.
[jira] [Commented] (CONNECTORS-114) Derby seems too unstable in multithreaded situations to be a good database for ManifoldCF, so try to add support for HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043369#comment-13043369 ] Karl Wright commented on CONNECTORS-114: Remaining issues with HSQLDB have been resolved, so I'm closing this ticket. r1131056. Derby seems too unstable in multithreaded situations to be a good database for ManifoldCF, so try to add support for HSQLDB Key: CONNECTORS-114 URL: https://issues.apache.org/jira/browse/CONNECTORS-114 Project: ManifoldCF Issue Type: Bug Components: Framework core Reporter: Karl Wright Fix For: ManifoldCF 0.3 Derby seems to have multiple problems: (1) It has internal deadlocks, which even if caught cause poor performance due to stalling (CONNECTORS-111); (2) It has no support for certain SQL constructs (CONNECTORS-109 and CONNECTORS-110); (3) It locks up entirely for some people (CONNECTORS-100). HSQLDB has been recommended as another potential embedded database that might work better.
[jira] [Resolved] (CONNECTORS-114) Derby seems too unstable in multithreaded situations to be a good database for ManifoldCF, so try to add support for HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-114. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 Assignee: Karl Wright I have not yet made HSQLDB the official Derby replacement, but it is currently a better embedded option for many situations than Derby is. Derby seems too unstable in multithreaded situations to be a good database for ManifoldCF, so try to add support for HSQLDB Key: CONNECTORS-114 URL: https://issues.apache.org/jira/browse/CONNECTORS-114 Project: ManifoldCF Issue Type: Bug Components: Framework core Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3 Derby seems to have multiple problems: (1) It has internal deadlocks, which even if caught cause poor performance due to stalling (CONNECTORS-111); (2) It has no support for certain SQL constructs (CONNECTORS-109 and CONNECTORS-110); (3) It locks up entirely for some people (CONNECTORS-100). HSQLDB has been recommended as another potential embedded database that might work better.
[jira] [Updated] (CONNECTORS-206) HSQLDB is now a first-class ManifoldCF database; we should describe how to use it in the documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-206: Affects Version/s: ManifoldCF 0.3 HSQLDB is now a first-class ManifoldCF database; we should describe how to use it in the documentation Key: CONNECTORS-206 URL: https://issues.apache.org/jira/browse/CONNECTORS-206 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright We're currently missing pretty much all mention of HSQLDB in the documentation. This includes how to enable it: property org.apache.manifoldcf.databaseimplementationclass, value org.apache.manifoldcf.core.database.DBInterfaceHSQLDB ... as well as the property it has for pointing at the database instance: property org.apache.manifoldcf.hsqldbdatabasepath, value (relative path). In addition to the site documentation for how to use it, we should also consider making HSQLDB the default example database, since it seems to have fewer real problems than Derby. But this must wait until a test suite is written for this database.
[jira] [Created] (CONNECTORS-207) ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-second retry, but should probably abort the job instead
ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-second retry, but should probably abort the job instead Key: CONNECTORS-207 URL: https://issues.apache.org/jira/browse/CONNECTORS-207 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Affects Versions: ManifoldCF 0.2, ManifoldCF 0.1, ManifoldCF 0.3 Reporter: Karl Wright The way a worker thread treats ManifoldCFException type REPOSITORY_CONNECTION_ERROR is no longer correct. It should probably just allow the job to be aborted with no retries.
[jira] [Updated] (CONNECTORS-207) ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-minute retry, but may want to abort the job instead
[ https://issues.apache.org/jira/browse/CONNECTORS-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-207: Description: The way a worker thread treats ManifoldCFException type REPOSITORY_CONNECTION_ERROR is to wait 5 minutes and retry. It might want to just allow the job to be aborted with no retries. The current behavior is not actually *wrong*, but the circumstances under which it was added were the result of severe problems at various sites that were unrelated to ManifoldCF. was: The way a worker thread treats ManifoldCFException type REPOSITORY_CONNECTION_ERROR is no longer correct. It should probably just allow the job to be aborted with no retries. Priority: Minor (was: Major) Summary: ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-minute retry, but may want to abort the job instead (was: ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-second retry, but should probably abort the job instead) ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-minute retry, but may want to abort the job instead Key: CONNECTORS-207 URL: https://issues.apache.org/jira/browse/CONNECTORS-207 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Priority: Minor The way a worker thread treats ManifoldCFException type REPOSITORY_CONNECTION_ERROR is to wait 5 minutes and retry. It might want to just allow the job to be aborted with no retries. The current behavior is not actually *wrong*, but the circumstances under which it was added were the result of severe problems at various sites that were unrelated to ManifoldCF.
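Editorial sketch of the two behaviors this ticket weighs, written as a small decision function. The five-minute delay comes from the updated description. The class, the enum, and the abortOnConnectionError switch are all hypothetical and do not reflect how the worker thread is actually written.

```java
// Illustration only: models "wait 5 minutes and retry" versus "abort the job".
public class ConnectionErrorPolicySketch {

  public enum Action { RETRY_LATER, ABORT_JOB }

  /** The retry interval stated in the ticket: five minutes, in milliseconds. */
  public static final long RETRY_DELAY_MILLIS = 5L * 60L * 1000L;

  /**
   * Chooses what to do after a REPOSITORY_CONNECTION_ERROR.
   * abortOnConnectionError is an invented switch between the current behavior
   * (retry later) and the proposed one (abort with no retries).
   */
  public static Action onRepositoryConnectionError(boolean abortOnConnectionError) {
    return abortOnConnectionError ? Action.ABORT_JOB : Action.RETRY_LATER;
  }

  /** Earliest time (ms) at which the document should be attempted again. */
  public static long nextAttemptTime(long nowMillis) {
    return nowMillis + RETRY_DELAY_MILLIS;
  }
}
```

Keeping the choice in one place would let the default stay as it is while a site that prefers aborting opts in.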
[jira] [Commented] (CONNECTORS-204) Now that HSQLDB functions with ManifoldCF, write a test-hsqldb ant target to test it
[ https://issues.apache.org/jira/browse/CONNECTORS-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044007#comment-13044007 ] Karl Wright commented on CONNECTORS-204: r1131177 has part of the code. Now that HSQLDB functions with ManifoldCF, write a test-hsqldb ant target to test it Key: CONNECTORS-204 URL: https://issues.apache.org/jira/browse/CONNECTORS-204 Project: ManifoldCF Issue Type: Improvement Components: Build Reporter: Karl Wright Assignee: Karl Wright The latest HSQLDB fixes and features make it an attractive alternative to Derby. But we need a test target that exercises it.