[jira] [Commented] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765173#comment-16765173 ]

Markus Schuch commented on CONNECTORS-1581:
---

{code}
select jobid, count(jobid), jobs.description from jobqueue left outer join jobs on jobs.id = jobqueue.jobid group by jobid;
{code}

actually returns 1 row with documents not having an existing job. These items belong to a job that we deleted some time ago. For some reason the cleanup did not work properly.

[~kwri...@metacarta.com] Is it safe to delete those rows from the jobqueue table?

> [Set priority thread] Error tossed: null during startup
> ---
>
> Key: CONNECTORS-1581
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1581
> Project: ManifoldCF
> Issue Type: Bug
> Environment: • ManifoldCF 2.12, running in a Docker Container based on Redhat Linux, OpenJDK 8
> • AWS RDS Database (Aurora MySQL -> 5.6 compatible, utf8 (collation utf8_bin))
> • Single Process Setup
> Reporter: Markus Schuch
> Assignee: Markus Schuch
> Priority: Major
>
> We see the following {{NullPointerException}} at startup:
> {code}
> [Set priority thread] FATAL org.apache.manifoldcf.crawlerthreads- Error tossed: null
> java.lang.NullPointerException
> at org.apache.manifoldcf.crawler.system.ManifoldCF.writeDocumentPriorities(ManifoldCF.java:1202)
> at org.apache.manifoldcf.crawler.system.SetPriorityThread.run(SetPriorityThread.java:141)
> {code}
> What could be the cause of that?

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Schuch reassigned CONNECTORS-1581: - Assignee: Markus Schuch
[jira] [Commented] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765178#comment-16765178 ] Karl Wright commented on CONNECTORS-1581: - Yes if the job ID doesn't show up anywhere it's safe to delete. How did you wind up in that situation though? Karl
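The orphan check and cleanup discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not a ManifoldCF procedure: it runs against an in-memory SQLite database with toy `jobs` and `jobqueue` tables that hold just the columns named in the query quoted earlier in this thread (`id`, `description`, `jobid`) plus a made-up `dochash` column. The real ManifoldCF schema has more columns, and the reporter's database is Aurora MySQL, so the statements may need adjusting there.

```python
import sqlite3


def find_orphaned_job_ids(conn):
    """Return (jobid, row_count) for jobqueue rows whose job no longer exists."""
    return conn.execute(
        """
        SELECT jobqueue.jobid, COUNT(*)
        FROM jobqueue
        LEFT OUTER JOIN jobs ON jobs.id = jobqueue.jobid
        WHERE jobs.id IS NULL
        GROUP BY jobqueue.jobid
        ORDER BY jobqueue.jobid
        """
    ).fetchall()


def delete_orphaned_rows(conn):
    """Delete jobqueue rows that reference a missing job; return rows removed."""
    cur = conn.execute(
        """
        DELETE FROM jobqueue
        WHERE NOT EXISTS (SELECT 1 FROM jobs WHERE jobs.id = jobqueue.jobid)
        """
    )
    conn.commit()
    return cur.rowcount


def build_demo_db():
    """Toy schema (an assumption): one live job and three rows left by job 99."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, description TEXT)")
    conn.execute(
        "CREATE TABLE jobqueue (id INTEGER PRIMARY KEY, jobid INTEGER, dochash TEXT)"
    )
    conn.execute("INSERT INTO jobs VALUES (1, 'live job')")
    conn.executemany(
        "INSERT INTO jobqueue (jobid, dochash) VALUES (?, ?)",
        [(1, "a"), (1, "b"), (99, "c"), (99, "d"), (99, "e")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print("orphans before:", find_orphaned_job_ids(demo))
    print("rows deleted:", delete_orphaned_rows(demo))
    print("orphans after:", find_orphaned_job_ids(demo))
```

Filtering on `jobs.id IS NULL` narrows the original left outer join to the rows without a job, which makes it easier to confirm what would be removed before running the delete.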
[jira] [Commented] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765186#comment-16765186 ]

Markus Schuch commented on CONNECTORS-1581:
---

{quote}How did you wind up in that situation though?{quote}

I do not know. We definitely did not tamper with the database manually. I will run some tests by removing some jobs within our setup to see if I can reproduce this.
[jira] [Comment Edited] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765173#comment-16765173 ]

Markus Schuch edited comment on CONNECTORS-1581 at 2/11/19 5:20 PM:

{code}
select jobid, count(jobid), jobs.description from jobqueue left outer join jobs on jobs.id = jobqueue.jobid group by jobid;
{code}

actually returns 1 row with documents not having an existing job. These items belong to a job that we deleted some time ago. For some reason the cleanup did not work properly.

[~kwri...@metacarta.com] Is it safe to delete those rows from the jobqueue table? (There are definitely no ingested documents left in the output repository)
[jira] [Comment Edited] (CONNECTORS-1581) [Set priority thread] Error tossed: null during startup
[ https://issues.apache.org/jira/browse/CONNECTORS-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765186#comment-16765186 ]

Markus Schuch edited comment on CONNECTORS-1581 at 2/11/19 5:23 PM:

{quote}How did you wind up in that situation though?{quote}

I do not know. We definitely did not tamper with the database manually. I will run some tests by removing some jobs within our setup to see if I can reproduce this.
[jira] [Created] (CONNECTORS-1583) ManifoldCF getting hung frequently
Pavithra Dhakshinamurthy created CONNECTORS-1583:

Summary: ManifoldCF getting hung frequently
Key: CONNECTORS-1583
URL: https://issues.apache.org/jira/browse/CONNECTORS-1583
Project: ManifoldCF
Issue Type: Bug
Affects Versions: ManifoldCF 2.9.1
Reporter: Pavithra Dhakshinamurthy
Attachments: image-2019-02-12-11-59-52-131.png

Hi Team,

We are using ManifoldCF version 2.9.1 for crawling documents. The ManifoldCF server hangs very frequently, which causes crawling to fail. When we access the ManifoldCF application, it returns a 404 error, but we can see the process still running in the background.

!image-2019-02-12-11-59-52-131.png|thumbnail!

Connectors used:
Repository: Documentum
Output: Elasticsearch

Kindly help us in resolving this issue.
[jira] [Commented] (CONNECTORS-1583) ManifoldCF getting hung frequently
[ https://issues.apache.org/jira/browse/CONNECTORS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765760#comment-16765760 ] Karl Wright commented on CONNECTORS-1583: - How have you deployed ManifoldCF? What app server are you using? What deployment model (e.g. which example)? The ManifoldCF UI runs underneath an application server. It appears to me like that application server is either inaccessible or has been shut down. This is not a ManifoldCF problem.
[jira] [Resolved] (CONNECTORS-1583) ManifoldCF getting hung frequently
[ https://issues.apache.org/jira/browse/CONNECTORS-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-1583. - Resolution: Incomplete
[jira] [Commented] (CONNECTORS-1580) Issues in documentum connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765053#comment-16765053 ]

Pavithra Dhakshinamurthy commented on CONNECTORS-1580:
--

Thanks Karl,

The documents which have already been indexed are getting processed but not getting updated in Elasticsearch while re-running the same job - *Working fine*

The issue below still exists in the Documentum connector. Please help us fix it.

1) We have scheduled a job to run every 15 minutes, and we have written a query in the addSeedDocuments method to get the document IDs. On each scheduled run of the job, the query returns a different set of records. All the document IDs are added through the method below:

activities.addSeedDocument(documentIdentifier);

How can we reset the seeded documents for each scheduled run of the same job?

> Issues in documentum connector
> --
>
> Key: CONNECTORS-1580
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1580
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Pavithra Dhakshinamurthy
> Priority: Blocker
> Attachments: Job_Scheduling.png
>
> Hi Team,
> We are facing the issues below in the Apache ManifoldCF Documentum connector, version 2.9.1. Kindly help us.
> 1. During the first run of the job, documents are indexed to Elasticsearch. If the same job is run again after it completes, records are seeded and processed but not updated in the output connector. Once a document ID has been indexed, the same document ID cannot be updated again in the same job.
> 2. We have scheduled incremental crawling every 15 minutes, and the document count varies from run to run. But seeding does not reset the document count once the job is completed; the new documents are added to the count from the last scheduled run.
> e.g. 1st schedule - 10 documents
> 2nd schedule - 5 documents
> In the 2nd scheduled run of the job, the document count should be 5, but it is 15. So it keeps adding the document IDs for every schedule and processing them.
[jira] [Resolved] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data
[ https://issues.apache.org/jira/browse/CONNECTORS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-1582.
-
Resolution: Not A Problem

> Unable to Crawl the Site Contents and Meta-Data
> ---
>
> Key: CONNECTORS-1582
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1582
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Pavithra Dhakshinamurthy
> Assignee: Karl Wright
> Priority: Major
>
> Hi,
> I'm currently using ManifoldCF (2.9.1) with SharePoint version 2003. I'm unable to crawl the site contents data. I am facing some issues that are hard to figure out and resolve. Can you please assist?
> There is a method (CheckMatch) that validates ASCII values for site contents, but I am unable to understand how the validation is used. I'm getting the error "no matching rule" because the CheckMatch() rule fails.
> Even though I tried the path types Library, List, Site, and Folder, I am unable to crawl the site contents and metadata. After adding logging, I can see the list of site contents.
> Thanks,
[jira] [Commented] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data
[ https://issues.apache.org/jira/browse/CONNECTORS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765019#comment-16765019 ] Karl Wright commented on CONNECTORS-1582: - Hi [~Pavithrad], the problem is that you will need not just one rule, but a rule for sites, and a rule for libraries, and a rule for documents. So if the entity you need to decide whether it is included is a site, then you need a site rule, and the same for libraries or documents. And since you can't get to all document metadata without drilling down through sites and libraries, you need the rules for these in order to get to the metadata for each of these levels. The documentation is pretty clear about how these rules work, but I agree that the interface is complex to work with.
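Karl's point that every level needs its own rule can be illustrated with a small model. This is not the SharePoint connector's code or its rule syntax: the `Rule` type, the glob-style patterns, the example paths, and the `is_reachable` helper are all hypothetical. The sketch only shows why a document rule by itself is not enough when the crawler has to drill down through a site and a library before it reaches the document.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    entity_type: str  # "site", "library", or "document" (hypothetical labels)
    pattern: str      # glob-style path pattern (an assumption, not connector syntax)


def matches(rules, entity_type, path):
    """True if at least one rule of the given type matches the path."""
    return any(
        rule.entity_type == entity_type and fnmatchcase(path, rule.pattern)
        for rule in rules
    )


def is_reachable(rules, levels):
    """levels lists (entity_type, path) pairs from the top site down to the target.

    The target is reachable only if every level on the way down is included.
    """
    return all(matches(rules, entity_type, path) for entity_type, path in levels)


# Hypothetical path from a site, through a library, to one document.
DOCUMENT_PATH = [
    ("site", "/sales"),
    ("library", "/sales/Contracts"),
    ("document", "/sales/Contracts/2019.docx"),
]

ONLY_DOCUMENT_RULE = [Rule("document", "/sales/*")]
RULES_FOR_EVERY_LEVEL = [
    Rule("site", "/sales"),
    Rule("library", "/sales/*"),
    Rule("document", "/sales/*"),
]

if __name__ == "__main__":
    print(is_reachable(ONLY_DOCUMENT_RULE, DOCUMENT_PATH))
    print(is_reachable(RULES_FOR_EVERY_LEVEL, DOCUMENT_PATH))
```

In this model the first rule set fails at the site level, so the document is never considered, which is one way a "no matching rule" outcome could arise.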
[jira] [Assigned] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data
[ https://issues.apache.org/jira/browse/CONNECTORS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-1582: --- Assignee: Karl Wright
[jira] [Created] (CONNECTORS-1582) Unable to Crawl the Site Contents and Meta-Data
Pavithra Dhakshinamurthy created CONNECTORS-1582:

Summary: Unable to Crawl the Site Contents and Meta-Data
Key: CONNECTORS-1582
URL: https://issues.apache.org/jira/browse/CONNECTORS-1582
Project: ManifoldCF
Issue Type: Bug
Reporter: Pavithra Dhakshinamurthy

Hi,

I'm currently using ManifoldCF (2.9.1) with SharePoint version 2003. I'm unable to crawl the site contents data. I am facing some issues that are hard to figure out and resolve. Can you please assist?

There is a method (CheckMatch) that validates ASCII values for site contents, but I am unable to understand how the validation is used. I'm getting the error "no matching rule" because the CheckMatch() rule fails.

Even though I tried the path types Library, List, Site, and Folder, I am unable to crawl the site contents and metadata. After adding logging, I can see the list of site contents.

Thanks,
[jira] [Commented] (CONNECTORS-1580) Issues in documentum connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765083#comment-16765083 ] Karl Wright commented on CONNECTORS-1580: - So you modified the Documentum Connector to change what addSeedDocument returns? Did you change what getModel() returns? Did you change how the version string is calculated in processDocuments()? If you don't do that the framework will not detect changes and will not work properly.
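The role of the version string that Karl mentions can be sketched with a toy model. Nothing here is ManifoldCF code: `ToyCrawler`, `compute_version`, and the dictionary standing in for the repository are hypothetical, and the real framework's bookkeeping is more involved. The sketch only shows the general idea of version-based incremental crawling, where a seeded document is sent to the output again only if its version string differs from the one recorded on the previous run.

```python
import hashlib


def compute_version(document):
    """Hypothetical version string: a hash of the fields that should trigger re-indexing."""
    payload = f"{document['modified']}|{document['content']}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyCrawler:
    """Minimal stand-in for a crawl framework that tracks one version per document."""

    def __init__(self):
        self.recorded_versions = {}  # document id -> version string from the last run

    def run(self, repository, seed_ids):
        """Process the seeded ids and return the ids that were (re)indexed."""
        sent = []
        for doc_id in seed_ids:
            version = compute_version(repository[doc_id])
            if self.recorded_versions.get(doc_id) != version:
                # New or changed document: record the version and index it.
                self.recorded_versions[doc_id] = version
                sent.append(doc_id)
        return sent


def demo():
    """Three runs: initial crawl, unchanged re-run, re-run after one document changes."""
    repository = {
        "doc-1": {"modified": "2019-02-11T10:00", "content": "first draft"},
        "doc-2": {"modified": "2019-02-11T10:05", "content": "other text"},
    }
    crawler = ToyCrawler()
    first = crawler.run(repository, ["doc-1", "doc-2"])
    second = crawler.run(repository, ["doc-1", "doc-2"])
    repository["doc-1"] = {"modified": "2019-02-11T10:20", "content": "second draft"}
    third = crawler.run(repository, ["doc-1", "doc-2"])
    return first, second, third


if __name__ == "__main__":
    print(demo())
```

If a modified connector computed a version string that never changes, the second branch of this model would apply on every run: documents would be seeded and processed but not sent to the output again, which resembles the behaviour described in the issue.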