[jira] [Commented] (CONNECTORS-1563) SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0 bytes
[ https://issues.apache.org/jira/browse/CONNECTORS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772774#comment-16772774 ]

Shinichiro Abe commented on CONNECTORS-1563:

Hi Subasini,
Did you try using the file system repository instead of the web repository? It did work for me with a configuration close to the settings above, though.
Also, what did the simple history report as not working?

> SolrException: org.apache.tika.exception.ZeroByteFileException: InputStream
> must have > 0 bytes
> ---
>
> Key: CONNECTORS-1563
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1563
> Project: ManifoldCF
> Issue Type: Task
> Components: Lucene/SOLR connector
> Reporter: Sneha
> Assignee: Karl Wright
> Priority: Major
> Attachments: Document simple history.docx, Manifold and Solr
> settings_CustomField.docx, managed-schema, manifold settings.docx,
> manifoldcf.log, path.png, schema.png, solr.log, solrconfig.xml
>
>
> I am encountering this problem:
> I have checked "Use the Extract Update Handler:" param then I am getting an
> error on Solr i.e. null:org.apache.solr.common.SolrException:
> org.apache.tika.exception.ZeroByteFileException: InputStream must have > 0
> bytes
> If I ignore tika exception, my documents get indexed but dont have content
> field on Solr.
> I am using Solr 7.3.1 and manifoldCF 2.8.1
> I am using solr cell and hence not configured external tika extractor in
> manifoldCF pipeline
> Please help me with this problem
> Thanks in advance

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
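The ZeroByteFileException quoted in this issue is raised when an empty stream reaches Tika. As a minimal sketch only, and not ManifoldCF, SolrJ, or Tika code, a client could peek at the first byte before posting a document to the extract handler. The class name `ZeroByteGuard` and the method `nonEmptyOrNull` are hypothetical, introduced purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

// Hypothetical helper; not part of ManifoldCF, SolrJ, or Tika.
class ZeroByteGuard {

    // Returns a stream positioned at the first byte, or null when the
    // input has zero bytes (the case rejected in this issue).
    static InputStream nonEmptyOrNull(InputStream in) {
        try {
            PushbackInputStream peekable = new PushbackInputStream(in, 1);
            int first = peekable.read();
            if (first == -1) {
                return null; // zero-byte input: the caller decides what to do
            }
            peekable.unread(first); // push the byte back so no content is lost
            return peekable;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream empty = nonEmptyOrNull(new ByteArrayInputStream(new byte[0]));
        InputStream one = nonEmptyOrNull(new ByteArrayInputStream(new byte[] {42}));
        System.out.println("empty input skipped: " + (empty == null));
        System.out.println("non-empty input kept: " + (one != null));
    }
}
```

Whether a zero-byte document should be skipped entirely or indexed with metadata only is a separate design decision that this sketch leaves to the caller.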
[jira] [Created] (CONNECTORS-1586) Create plugin for Solr 8.0.0 when available
Shinichiro Abe created CONNECTORS-1586:

Summary: Create plugin for Solr 8.0.0 when available
Key: CONNECTORS-1586
URL: https://issues.apache.org/jira/browse/CONNECTORS-1586
Project: ManifoldCF
Issue Type: Task
Reporter: Shinichiro Abe

The plugin for Solr 8.0 release.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1533) Solr Connector is unable to ingest documents
[ https://issues.apache.org/jira/browse/CONNECTORS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624771#comment-16624771 ]

Shinichiro Abe commented on CONNECTORS-1533:

I could see that exception with zero-length documents, too, but I don't know how to turn it off.
Please change the settings to useExtractHandler=false and path "/update/extract" -> "/update", and add the Tika extractor to the pipeline. With these changes, you can see the "missing content stream" exception when posting documents.

> Solr Connector is unable to ingest documents
>
>
> Key: CONNECTORS-1533
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1533
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 2.11
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.11
>
> Attachments: 2018-09-23-012800.png, CONNECTORS-1533.patch
>
>
> The "r69acbd9 - Fix solr connector content deletion bug" has introduced
> another bug :
> It is now impossible to ingest documents into Solr 7.4.0, we obtain the
> following error : Error from server at http://localhost:8983/solr/FileShare:
> missing content stream
> The fact is, the requestWriter.getContentWriter(request) object is equal to
> null only on commit requests. So the new lines of code introduced by the fix,
> which are based on the test of this object, result in a null
> Collection streams object and so the update request is failing.
> Concerned class :
> org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1533) Solr Connector is unable to ingest documents
[ https://issues.apache.org/jira/browse/CONNECTORS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624757#comment-16624757 ]

Shinichiro Abe commented on CONNECTORS-1533:

I set up SolrCloud in the standard way:
{noformat}
cd solr-7.4.0
./bin/solr -c -f
./bin/solr create_collection -c collection1
./bin/solr delete -c collection1
{noformat}

> Solr Connector is unable to ingest documents
>
>
> Key: CONNECTORS-1533
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1533
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 2.11
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.11
>
> Attachments: 2018-09-23-012800.png, CONNECTORS-1533.patch
>
>
> The "r69acbd9 - Fix solr connector content deletion bug" has introduced
> another bug :
> It is now impossible to ingest documents into Solr 7.4.0, we obtain the
> following error : Error from server at http://localhost:8983/solr/FileShare:
> missing content stream
> The fact is, the requestWriter.getContentWriter(request) object is equal to
> null only on commit requests. So the new lines of code introduced by the fix,
> which are based on the test of this object, result in a null
> Collection streams object and so the update request is failing.
> Concerned class :
> org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CONNECTORS-1533) Solr Connector is unable to ingest documents
[ https://issues.apache.org/jira/browse/CONNECTORS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1533: --- Attachment: 2018-09-23-012800.png > Solr Connector is unable to ingest documents > > > Key: CONNECTORS-1533 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1533 > Project: ManifoldCF > Issue Type: Bug > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.11 >Reporter: Julien Massiera >Assignee: Karl Wright >Priority: Major > Fix For: ManifoldCF 2.11 > > Attachments: 2018-09-23-012800.png, CONNECTORS-1533.patch > > > The "r69acbd9 - Fix solr connector content deletion bug" has introduced > another bug : > It is now impossible to ingest documents into Solr 7.4.0, we obtain the > following error : Error from server at http://localhost:8983/solr/FileShare: > missing content stream > The fact is, the requestWriter.getContentWriter(request) object is equal to > null only on commit requests. So the new lines of code introduced by the fix, > which are based on the test of this object, result in a null > Collection streams object and so the update request is failing. > Concerned class : > org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1533) Solr Connector is unable to ingest documents
[ https://issues.apache.org/jira/browse/CONNECTORS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624728#comment-16624728 ]

Shinichiro Abe commented on CONNECTORS-1533:

[~kwri...@metacarta.com], I did a fresh checkout again 10 minutes ago, built it, and ran it with a new Solr collection; document deletion still did not succeed.

> Solr Connector is unable to ingest documents
>
>
> Key: CONNECTORS-1533
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1533
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 2.11
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.11
>
> Attachments: CONNECTORS-1533.patch
>
>
> The "r69acbd9 - Fix solr connector content deletion bug" has introduced
> another bug :
> It is now impossible to ingest documents into Solr 7.4.0, we obtain the
> following error : Error from server at http://localhost:8983/solr/FileShare:
> missing content stream
> The fact is, the requestWriter.getContentWriter(request) object is equal to
> null only on commit requests. So the new lines of code introduced by the fix,
> which are based on the test of this object, result in a null
> Collection streams object and so the update request is failing.
> Concerned class :
> org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CONNECTORS-1533) Solr Connector is unable to ingest documents
[ https://issues.apache.org/jira/browse/CONNECTORS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624419#comment-16624419 ]

Shinichiro Abe edited comment on CONNECTORS-1533 at 9/22/18 1:44 AM:

Hi,
I tested the RC artifact a few days ago and saw the "missing content stream" error in the Solr instance when posting documents.
I pulled the latest trunk today, and I still see the "missing content stream" error when documents are deleted on job deletion in crawler-ui.
ModifiedHttpClient was introduced in CONNECTORS-623. As far as I know, changing it may still have some impact on backward compatibility, but I think that function will need to be changed or removed to work with the latest Solr.

was (Author: shinichiro abe):
HI, I tested for RC artifact a few days ago, then I saw the "missing content stream" error in Solr instance when posting documents. I pulled latest trunk today, I still see the "missing content stream" error, when deletion documents at Job deletion in crawler-ui. ModifiedHttpClient was introduced in CONNECTORS-623. As far as I know there may still have a few impacts as to back compat, that function will need to be changed or removed to work in latest Solr.

> Solr Connector is unable to ingest documents
>
>
> Key: CONNECTORS-1533
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1533
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 2.11
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.11
>
>
> The "r69acbd9 - Fix solr connector content deletion bug" has introduced
> another bug :
> It is now impossible to ingest documents into Solr 7.4.0, we obtain the
> following error : Error from server at http://localhost:8983/solr/FileShare:
> missing content stream
> The fact is, the requestWriter.getContentWriter(request) object is equal to
> null only on commit requests. So the new lines of code introduced by the fix,
> which are based on the test of this object, result in a null
> Collection streams object and so the update request is failing.
> Concerned class :
> org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1533) Solr Connector is unable to ingest documents
[ https://issues.apache.org/jira/browse/CONNECTORS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624419#comment-16624419 ]

Shinichiro Abe commented on CONNECTORS-1533:

Hi,
I tested the RC artifact a few days ago and saw the "missing content stream" error in the Solr instance when posting documents.
I pulled the latest trunk today, and I still see the "missing content stream" error when documents are deleted on job deletion in crawler-ui.
ModifiedHttpClient was introduced in CONNECTORS-623. As far as I know, changing it may still have some impact on backward compatibility; that function will need to be changed or removed to work with the latest Solr.

> Solr Connector is unable to ingest documents
>
>
> Key: CONNECTORS-1533
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1533
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 2.11
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.11
>
>
> The "r69acbd9 - Fix solr connector content deletion bug" has introduced
> another bug :
> It is now impossible to ingest documents into Solr 7.4.0, we obtain the
> following error : Error from server at http://localhost:8983/solr/FileShare:
> missing content stream
> The fact is, the requestWriter.getContentWriter(request) object is equal to
> null only on commit requests. So the new lines of code introduced by the fix,
> which are based on the test of this object, result in a null
> Collection streams object and so the update request is failing.
> Concerned class :
> org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrClient

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
[ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452238#comment-16452238 ]

Shinichiro Abe commented on CONNECTORS-1503:

Maybe we have to break Solr#add(doc) down into SolrRequest#process(SolrClient client, String collection), in the same way that contentStreamUpdateRequest.process( solrServer ) is used when useExtractUpdateHandler=true.
{noformat}
//response = solrServer.add( currentSolrDoc ); // <- current impl
UpdateRequest req = new UpdateRequest();
req.setParams(params); // <- ModifiableSolrParams: params through writeField(...)
req.add(currentSolrDoc); // <- SolrInputDocument
req.setCommitWithin(commitWithinMs);
response = req.process(solrServer, (String)null); // <- default collection
{noformat}

> UpdateProcessor SolrCloud and ManifoldCF
>
>
> Key: CONNECTORS-1503
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1503
> Project: ManifoldCF
> Issue Type: Bug
> Components: Solr 6.x component
> Affects Versions: ManifoldCF 2.9.1
> Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
> Reporter: Maxence SAUNIER
> Assignee: Shinichiro Abe
> Priority: Major
> Attachments: 20170421-1740.png, jira_update_processor.png,
> manifoldcf_arguments_uniqFields.png, manifoldcf_output_conf.zip
>
>
> Hello,
> [Link to Apache mail
> archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E]
> When we used Argument option in ManifoldCF for SolrCloud, ManifoldCF add they
> arguments on the POST request and not on the url parameters. So, for add a
> (pre)processor or a post-processor with the url, it's not possible.
> [SolrConfig
> updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_]
> [call
> UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]
> [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png]
> Solr response:
> org.apache.solr.common.SolrException: ERROR:
> [doc=file:/srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc]
> unknown field 'processor'

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
[ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452143#comment-16452143 ]

Shinichiro Abe commented on CONNECTORS-1503:

Yes, you are right. It should be writeField(ModifiableSolrParams out, String fieldName, String fieldValue).
I was using the standard handler with curl-based posts, which does not use MCF's Tika option.

> UpdateProcessor SolrCloud and ManifoldCF
>
>
> Key: CONNECTORS-1503
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1503
> Project: ManifoldCF
> Issue Type: Bug
> Components: Solr 6.x component
> Affects Versions: ManifoldCF 2.9.1
> Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
> Reporter: Maxence SAUNIER
> Assignee: Shinichiro Abe
> Priority: Major
> Attachments: 20170421-1740.png, jira_update_processor.png,
> manifoldcf_arguments_uniqFields.png, manifoldcf_output_conf.zip
>
>
> Hello,
> [Link to Apache mail
> archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E]
> When we used Argument option in ManifoldCF for SolrCloud, ManifoldCF add they
> arguments on the POST request and not on the url parameters. So, for add a
> (pre)processor or a post-processor with the url, it's not possible.
> [SolrConfig
> updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_]
> [call
> UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]
> [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png]
> Solr response:
> org.apache.solr.common.SolrException: ERROR:
> [doc=file:/srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc]
> unknown field 'processor'

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
[ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452080#comment-16452080 ]

Shinichiro Abe commented on CONNECTORS-1503:

In my environment, the standard handler with the processor param works well, so in my opinion something is wrong in your environment. A Solr processor runs on each node per document, so I do not think the Solr client has a problem.
Also, Solr Cell has SolrContentHandler, which captures the content body correctly; MCF's Tika extractor, on the other hand, does not have it. That is a difference.
I would like to see a verbose exception stack trace.

> UpdateProcessor SolrCloud and ManifoldCF
>
>
> Key: CONNECTORS-1503
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1503
> Project: ManifoldCF
> Issue Type: Bug
> Components: Solr 6.x component
> Affects Versions: ManifoldCF 2.9.1
> Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
> Reporter: Maxence SAUNIER
> Assignee: Shinichiro Abe
> Priority: Major
> Attachments: 20170421-1740.png, jira_update_processor.png,
> manifoldcf_arguments_uniqFields.png, manifoldcf_output_conf.zip
>
>
> Hello,
> [Link to Apache mail
> archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E]
> When we used Argument option in ManifoldCF for SolrCloud, ManifoldCF add they
> arguments on the POST request and not on the url parameters. So, for add a
> (pre)processor or a post-processor with the url, it's not possible.
> [SolrConfig
> updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_]
> [call
> UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]
> [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png]
> Solr response:
> org.apache.solr.common.SolrException: ERROR:
> [doc=file:/srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc]
> unknown field 'processor'

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
[ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447908#comment-16447908 ]

Shinichiro Abe commented on CONNECTORS-1503:

1. The "unique fields" processor test with ManifoldCF itself works well.
2. A curl-based HTTP POST request with "processor=" works with Solr.

I don't get any exception with ManifoldCF/Solr. By the way, I'm using Solr 7.2 since ManifoldCF's SolrJ version is 7.x.

> UpdateProcessor SolrCloud and ManifoldCF
>
>
> Key: CONNECTORS-1503
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1503
> Project: ManifoldCF
> Issue Type: Bug
> Components: Solr 6.x component
> Affects Versions: ManifoldCF 2.9.1
> Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
> Reporter: Maxence SAUNIER
> Assignee: Shinichiro Abe
> Priority: Major
> Attachments: 20170421-1740.png, jira_update_processor.png,
> manifoldcf_arguments_uniqFields.png
>
>
> Hello,
> [Link to Apache mail
> archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E]
> When we used Argument option in ManifoldCF for SolrCloud, ManifoldCF add they
> arguments on the POST request and not on the url parameters. So, for add a
> (pre)processor or a post-processor with the url, it's not possible.
> [SolrConfig
> updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_]
> [call
> UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]
> [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png]
> Solr response:
> org.apache.solr.common.SolrException: ERROR:
> [doc=file:/srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc]
> unknown field 'processor'

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
[ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446681#comment-16446681 ]

Shinichiro Abe commented on CONNECTORS-1503:

I attached my config. In that case, the f1_ss field value is "a", a single unique value produced by Solr's URP.
But when I remove that processor argument, the f1_ss field value is ["a", "a"]; it has two values.
As far as I know, that URP works via HttpPoster.

> UpdateProcessor SolrCloud and ManifoldCF
>
>
> Key: CONNECTORS-1503
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1503
> Project: ManifoldCF
> Issue Type: Bug
> Components: Solr 6.x component
> Affects Versions: ManifoldCF 2.9.1
> Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
> Reporter: Maxence SAUNIER
> Assignee: Shinichiro Abe
> Priority: Major
> Attachments: 20170421-1740.png, jira_update_processor.png
>
>
> Hello,
> [Link to Apache mail
> archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E]
> When we used Argument option in ManifoldCF for SolrCloud, ManifoldCF add they
> arguments on the POST request and not on the url parameters. So, for add a
> (pre)processor or a post-processor with the url, it's not possible.
> [SolrConfig
> updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_]
> [call
> UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]
> [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png]
> Solr response:
> org.apache.solr.common.SolrException: ERROR:
> [doc=file:/srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc]
> unknown field 'processor'

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
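The effect described in this comment, where ["a", "a"] collapses to a single "a" once the processor argument is present, can be illustrated with a small stand-alone sketch. This is not Solr's UniqFieldsUpdateProcessorFactory code; `UniqValuesSketch` and `uniq` are hypothetical names, and the sketch only assumes that duplicate values are removed while first-seen order is kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration of de-duplicating a multi-valued field.
class UniqValuesSketch {

    // Drops repeated values and keeps the order of first appearance.
    static List<String> uniq(List<String> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        // Without de-duplication the field would keep both values.
        List<String> raw = Arrays.asList("a", "a");
        System.out.println("before: " + raw);
        System.out.println("after:  " + uniq(raw));
    }
}
```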
[jira] [Updated] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
[ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1503: --- Attachment: 20170421-1740.png > UpdateProcessor SolrCloud and ManifoldCF > > > Key: CONNECTORS-1503 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1503 > Project: ManifoldCF > Issue Type: Bug > Components: Solr 6.x component >Affects Versions: ManifoldCF 2.9.1 > Environment: SolrCloud 6.6 > ManifoldCF 2.9.1 >Reporter: Maxence SAUNIER >Assignee: Shinichiro Abe >Priority: Major > Attachments: 20170421-1740.png, jira_update_processor.png > > > Hello, > [Link to Apache mail > archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E] > When we used Argument option in ManifoldCF for SolrCloud, ManifoldCF add they > arguments on the POST request and not on the url parameters. So, for add a > (pre)processor or a post-processor with the url, it's not possible. > [SolrConfig > updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_] > [call > UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters] > [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png] > Solr response: > org.apache.solr.common.SolrException: ERROR: > [doc=file:/srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc] > unknown field 'processor' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1503) UpdateProcessor SolrCloud and ManifoldCF
[ https://issues.apache.org/jira/browse/CONNECTORS-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444622#comment-16444622 ]

Shinichiro Abe commented on CONNECTORS-1503:

I could not reproduce it.
{noformat}
curl http://localhost:8983/solr/collection1/config -d '{
  "add-updateprocessor" : {
    "name": "uniqFields",
    "class": "solr.UniqFieldsUpdateProcessorFactory",
    "fieldName": "content"
  }
}'
{noformat}
Then I put processor=uniqFields on the Arguments tab in MCF.
I didn't get any Solr error response; it seems to work. The method, GET or POST, doesn't matter.

> UpdateProcessor SolrCloud and ManifoldCF
>
>
> Key: CONNECTORS-1503
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1503
> Project: ManifoldCF
> Issue Type: Bug
> Components: Solr 6.x component
> Affects Versions: ManifoldCF 2.9.1
> Environment: SolrCloud 6.6
> ManifoldCF 2.9.1
> Reporter: Maxence SAUNIER
> Assignee: Shinichiro Abe
> Priority: Major
>
> Hello,
> [Link to Apache mail
> archive|http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201804.mbox/%3C079e01d3d7da%24807b8f60%248172ae20%24%40citya.com%3E]
> When we used Argument option in ManifoldCF for SolrCloud, ManifoldCF add they
> arguments on the POST request and not on the url parameters. So, for add a
> (pre)processor or a post-processor with the url, it's not possible.
> [SolrConfig
> updateRequestProcessorChain|https://lucene.apache.org/solr/guide/6_6/config-api.html#ConfigAPI-Whatabout_updateRequestProcessorChain_]
> [call
> UpdateRequestProcessors|https://lucene.apache.org/solr/guide/6_6/update-request-processors.html#UpdateRequestProcessors-Processor_Post-ProcessorRequestParameters]
> [Conf image|https://image.ibb.co/cZC8bn/jira_update_processor.png]
> Solr response:
> org.apache.solr.common.SolrException: ERROR:
> [doc=file:/srvics01/ways_holding/gestion_ged/gerance/3573/201102081135_ENVOIDEVISPP.doc]
> unknown field 'processor'

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
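The issue description contrasts arguments sent in the POST body with arguments sent as URL parameters. As a rough sketch of the second form only, the following builds an update URL that carries processor=uniqFields in the query string. `UpdateUrlSketch`, `enc`, and `withQuery` are hypothetical names; this is not how ManifoldCF's HttpPoster builds requests, and the host, collection, and `/update` path are assumed from the discussion in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not ManifoldCF or SolrJ code.
class UpdateUrlSketch {

    // URL-encodes one query component as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Appends the parameters to the base URL as a query string.
    static String withQuery(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char sep = baseUrl.contains("?") ? '&' : '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep).append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("processor", "uniqFields");
        System.out.println(withQuery("http://localhost:8983/solr/collection1/update", params));
    }
}
```

With the parameter in the query string it is read as a request parameter rather than a document field, which is the distinction the "unknown field 'processor'" response in the issue points at.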
[jira] [Resolved] (CONNECTORS-1496) Solr plugin test fails when upgrading to 7.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1496. Resolution: Fixed > Solr plugin test fails when upgrading to 7.2.1 > -- > > Key: CONNECTORS-1496 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1496 > Project: ManifoldCF > Issue Type: Bug > Components: Solr 7.x component >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Trivial > Attachments: CONNECTORS-1496.patch > > > The log4j dependency in test-framework was introduced at SOLR-10628. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CONNECTORS-1253) Add QuickStart application
[ https://issues.apache.org/jira/browse/CONNECTORS-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1253. Resolution: Won't Fix Feel free to revisit if volunteers step forward. > Add QuickStart application > -- > > Key: CONNECTORS-1253 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1253 > Project: ManifoldCF > Issue Type: Improvement > Components: Build >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Major > Fix For: ManifoldCF next > > Attachments: CONNECTORS-1253.patch, CONNECTORS-1253.patch > > > A simple application packaged by maven in case developers do not use ant > build. Similar to maven exec. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CONNECTORS-1219) Lucene Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1219. Resolution: Won't Fix Feel free to revisit if volunteers step forward. > Lucene Output Connector > --- > > Key: CONNECTORS-1219 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1219 > Project: ManifoldCF > Issue Type: New Feature >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Major > Attachments: CONNECTORS-1219-v0.1patch.patch, > CONNECTORS-1219-v0.2.patch, CONNECTORS-1219-v0.3.patch > > > A output connector for Lucene local index directly, not via remote search > engine. It would be nice if we could use Lucene various API to the index > directly, even though we could do the same thing to the Solr or Elasticsearch > index. I assume we can do something to classification, categorization, and > tagging, using e.g lucene-classification package. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CONNECTORS-1443) Create plugin for Solr 7.0.0 when available
[ https://issues.apache.org/jira/browse/CONNECTORS-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1443. Resolution: Fixed released on 2017 Sep 24. > Create plugin for Solr 7.0.0 when available > --- > > Key: CONNECTORS-1443 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1443 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector, Solr 7.x component >Reporter: Shinichiro Abe >Priority: Major > Attachments: CONNECTORS-1443-plugin.v1.patch, > CONNECTORS-1443-plugin.v2.patch > > > The plugin and connector for Solr 7.0 release. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1496) Solr plugin test fails when upgrading to 7.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376152#comment-16376152 ] Shinichiro Abe commented on CONNECTORS-1496: Also, I did Github mirror request at INFRA-16101. > Solr plugin test fails when upgrading to 7.2.1 > -- > > Key: CONNECTORS-1496 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1496 > Project: ManifoldCF > Issue Type: Bug > Components: Solr 7.x component >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Trivial > Attachments: CONNECTORS-1496.patch > > > The log4j dependency in test-framework was introduced at SOLR-10628. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1496) Solr plugin test fails when upgrading to 7.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376151#comment-16376151 ] Shinichiro Abe commented on CONNECTORS-1496: r1825314. I committed for Solr 7.2.1 upgrading. Thank you. > Solr plugin test fails when upgrading to 7.2.1 > -- > > Key: CONNECTORS-1496 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1496 > Project: ManifoldCF > Issue Type: Bug > Components: Solr 7.x component >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Trivial > Attachments: CONNECTORS-1496.patch > > > The log4j dependency in test-framework was introduced at SOLR-10628. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CONNECTORS-1496) Solr plugin test fails when upgrading to 7.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1496: --- Attachment: CONNECTORS-1496.patch > Solr plugin test fails when upgrading to 7.2.1 > -- > > Key: CONNECTORS-1496 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1496 > Project: ManifoldCF > Issue Type: Bug > Components: Solr 7.x component >Reporter: Shinichiro Abe >Priority: Trivial > Attachments: CONNECTORS-1496.patch > > > The log4j dependency in test-framework was introduced at SOLR-10628. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CONNECTORS-1496) Solr plugin test fails when upgrading to 7.2.1
Shinichiro Abe created CONNECTORS-1496: -- Summary: Solr plugin test fails when upgrading to 7.2.1 Key: CONNECTORS-1496 URL: https://issues.apache.org/jira/browse/CONNECTORS-1496 Project: ManifoldCF Issue Type: Bug Components: Solr 7.x component Reporter: Shinichiro Abe The log4j dependency in test-framework was introduced at SOLR-10628. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1490) GSOC: MongoDB Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362991#comment-16362991 ] Shinichiro Abe commented on CONNECTORS-1490: Hi, You would need to implement the MongoDB output connector by extending [BaseOutputConnector|https://github.com/apache/manifoldcf/blob/trunk/framework/agents/src/main/java/org/apache/manifoldcf/agents/output/BaseOutputConnector.java]. > GSOC: MongoDB Output Connector > -- > > Key: CONNECTORS-1490 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1490 > Project: ManifoldCF > Issue Type: New Feature > Components: MongoDB Output Connector >Reporter: Piergiorgio Lucidi >Assignee: Piergiorgio Lucidi >Priority: Major > Labels: MongoDB, gsoc2018, java, junit > Original Estimate: 480h > Remaining Estimate: 480h > > This is a project idea for [Google Summer of > Code|https://summerofcode.withgoogle.com/] (GSOC). > To discuss this or other ideas with your potential mentor from the Apache > ManifoldCF project, sign up and post to the dev@manifoldcf.apache.org list, > including "[GSOC]" in the subject. You may also comment on this Jira issue if > you have created an account. > We would like to extend the Content Migration capabilities adding MongoDB / > GridFS as a new output connector for importing contents from one or more > repositories supported by ManifoldCF. In this way we will help developers on > migrating contents from different data sources on MongoDB. 
> You will be involved in the development of the following tasks, you will > learn how to: > * Write the connector implementation > * Implement unit tests > * Build all the integration tests for testing the connector inside the > framework > * Write the documentation for this connector > We have a complete documentation on how to implement an Output Connector: > [https://manifoldcf.apache.org/release/release-2.9.1/en_US/writing-output-connectors.html] > Take a look also at our book to understand better the framework and how to > implement connectors: > [https://github.com/DaddyWri/manifoldcfinaction/tree/master/pdfs] > > Prospective GSOC mentor: > [piergior...@apache.org|mailto:piergior...@apache.org] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1443) Upgrade to SolrJ 7.0.0 when available
[ https://issues.apache.org/jira/browse/CONNECTORS-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173985#comment-16173985 ] Shinichiro Abe commented on CONNECTORS-1443: Committed r1809102. > Upgrade to SolrJ 7.0.0 when available > - > > Key: CONNECTORS-1443 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1443 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector, Solr 7.x component >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1443-plugin.v1.patch, > CONNECTORS-1443-plugin.v2.patch > > > The plugin and connector for Solr 7.0 release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CONNECTORS-1443) Upgrade to SolrJ 7.0.0 when available
[ https://issues.apache.org/jira/browse/CONNECTORS-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1443: --- Attachment: CONNECTORS-1443-plugin.v2.patch Plugin patch for the solr-7.x repository I have just created. > Upgrade to SolrJ 7.0.0 when available > - > > Key: CONNECTORS-1443 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1443 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector, Solr 7.x component >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1443-plugin.v1.patch, > CONNECTORS-1443-plugin.v2.patch > > > The plugin and connector for Solr 7.0 release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CONNECTORS-1443) Upgrade to SolrJ 7.0.0 when available
[ https://issues.apache.org/jira/browse/CONNECTORS-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1443: --- Attachment: CONNECTORS-1443-plugin.v1.patch plugin diff patch based on solr-6.x. > Upgrade to SolrJ 7.0.0 when available > - > > Key: CONNECTORS-1443 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1443 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector, Solr 7.x component >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1443-plugin.v1.patch > > > The plugin and connector for Solr 7.0 release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CONNECTORS-1443) Upgrade to SolrJ 7.0.0 when available
Shinichiro Abe created CONNECTORS-1443: -- Summary: Upgrade to SolrJ 7.0.0 when available Key: CONNECTORS-1443 URL: https://issues.apache.org/jira/browse/CONNECTORS-1443 Project: ManifoldCF Issue Type: Task Components: Solr 7.x component, Lucene/SOLR connector Reporter: Shinichiro Abe The plugin and connector for Solr 7.0 release. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (CONNECTORS-1427) Remove tests for Tika service connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016736#comment-16016736 ] Shinichiro Abe edited comment on CONNECTORS-1427 at 5/19/17 12:47 AM: -- we can also remove src/test/resources/test-documents' resources when writing test code of tika service connector not to use it. was (Author: shinichiro abe): we can also remove src/test/resources/test-documents' resources when writing test code of tika service connector using it. > Remove tests for Tika service connector > --- > > Key: CONNECTORS-1427 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1427 > Project: ManifoldCF > Issue Type: Task > Components: Tika extractor >Affects Versions: ManifoldCF 2.7 >Reporter: Julien Massiera >Assignee: Karl Wright >Priority: Minor > Fix For: ManifoldCF 2.8 > > > The new Tika service connector is using the test classes of the Tika > extractor connector which are obviously not adapted and therefore trigger > errors during mvn install > N.B : Need to create a "Tika service component" in JIRA -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CONNECTORS-1427) Remove tests for Tika service connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016736#comment-16016736 ] Shinichiro Abe commented on CONNECTORS-1427: we can also remove src/test/resources/test-documents' resources when writing test code of tika service connector using it. > Remove tests for Tika service connector > --- > > Key: CONNECTORS-1427 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1427 > Project: ManifoldCF > Issue Type: Task > Components: Tika extractor >Affects Versions: ManifoldCF 2.7 >Reporter: Julien Massiera >Assignee: Karl Wright >Priority: Minor > Fix For: ManifoldCF 2.8 > > > The new Tika service connector is using the test classes of the Tika > extractor connector which are obviously not adapted and therefore trigger > errors during mvn install > N.B : Need to create a "Tika service component" in JIRA -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CONNECTORS-1238) Update MCF's lucene/solr connector to use the most recent solr jar etc.
[ https://issues.apache.org/jira/browse/CONNECTORS-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15715003#comment-15715003 ] Shinichiro Abe commented on CONNECTORS-1238: Yes, I did. > Update MCF's lucene/solr connector to use the most recent solr jar etc. > --- > > Key: CONNECTORS-1238 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1238 > Project: ManifoldCF > Issue Type: Task > Components: Build, Lucene/SOLR connector >Affects Versions: ManifoldCF 2.3 >Reporter: Karl Wright >Assignee: Karl Wright > Fix For: ManifoldCF 2.6 > > > Solr is up to revision 5.3, and 5.3.1 is also imminent. We should bring MCF > up to date, especially since there's now a new "security module" which > provides authentication of various kinds. > It's possible that this will also require updates to httpclient. Some effort > has been made to make Kerberos be more usable in the client, although I don't > know where that effort stands. It may not be available even yet. In general > we try to upgrade to whatever version Solr is happy with, since Solr is so > dependent on precise versions of httpclient. > [~alessandro.benedetti]], perhaps you might have some experience with this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1338) Upgrade to SolrJ 6.3.0
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653202#comment-15653202 ] Shinichiro Abe commented on CONNECTORS-1338: No, we don't need to release the plugin; its updates are just for testing against the component/queryparser interface. > Upgrade to SolrJ 6.3.0 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Fix For: ManifoldCF 2.6 > > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338-integration.patch, CONNECTORS-1338.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1338) Upgrade to SolrJ 6.3.0
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1338. Resolution: Fixed Fix Version/s: ManifoldCF 2.6 Committed r1769036 for trunk, r1769037 for solr-plugin. > Upgrade to SolrJ 6.3.0 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Fix For: ManifoldCF 2.6 > > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338-integration.patch, CONNECTORS-1338.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1338) Upgrade to SolrJ 6.3.0
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1338: --- Attachment: CONNECTORS-1338-integration.patch > Upgrade to SolrJ 6.3.0 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338-integration.patch, CONNECTORS-1338.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1338) Upgrade to SolrJ 6.3.0
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1338: --- Attachment: CONNECTORS-1338.patch > Upgrade to SolrJ 6.3.0 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338-integration.patch, CONNECTORS-1338.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1338) Upgrade to SolrJ 6.3.0
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15653092#comment-15653092 ] Shinichiro Abe commented on CONNECTORS-1338: Solr 6.3.0 was released this week. Testing with the attached patch has passed. I'll commit it soon. > Upgrade to SolrJ 6.3.0 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1338) Upgrade to SolrJ 6.3.0
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1338: --- Summary: Upgrade to SolrJ 6.3.0 (was: Upgrade to SolrJ 6.2.1) > Upgrade to SolrJ 6.3.0 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1338) Upgrade to SolrJ 6.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546007#comment-15546007 ] Shinichiro Abe commented on CONNECTORS-1338: The Jackson and Guava dependencies were removed in SOLR-9588 and SOLR-9589. There are no new SolrJ dependencies now. I'll update the SolrJ version at the next Solr release. > Upgrade to SolrJ 6.2.1 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1338) Upgrade to SolrJ 6.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511033#comment-15511033 ] Shinichiro Abe commented on CONNECTORS-1338: I am postponing the commit. The Jackson version seems to be incorrect in Solr, which is being discussed in SOLR-9542. > Upgrade to SolrJ 6.2.1 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1338) Upgrade to SolrJ 6.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509588#comment-15509588 ] Shinichiro Abe commented on CONNECTORS-1338: Thank you for the reply. Jackson (SOLR-9200) and Guava (for a test annotation) have been added to the SolrJ 6.x dependencies. The current HttpPoster is not affected by these dependencies, but I'll look into it to make sure MCF has the right versions. > Upgrade to SolrJ 6.2.1 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CONNECTORS-1338) Upgrade to SolrJ 6.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe reassigned CONNECTORS-1338: -- Assignee: Shinichiro Abe > Upgrade to SolrJ 6.2.1 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1338) Upgrade to SolrJ 6.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509394#comment-15509394 ] Shinichiro Abe commented on CONNECTORS-1338: Any objection if I commit those? I'll commit those tomorrow. > Upgrade to SolrJ 6.2.1 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1338) Upgrade to SolrJ 6.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1338: --- Attachment: CONNECTORS-1338-integration.patch a patch for solr-plugin. > Upgrade to SolrJ 6.2.1 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1338-integration.patch, > CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1338) Upgrade to SolrJ 6.2.1
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1338: --- Attachment: CONNECTORS-1338.patch updated patch. testing has passed. > Upgrade to SolrJ 6.2.1 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1338.patch, CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1238) Update MCF's lucene/solr connector to use the most recent solr jar etc.
[ https://issues.apache.org/jira/browse/CONNECTORS-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15471089#comment-15471089 ] Shinichiro Abe commented on CONNECTORS-1238: Yes, I agree with you. They would remain for the time being. Thank you. > Update MCF's lucene/solr connector to use the most recent solr jar etc. > --- > > Key: CONNECTORS-1238 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1238 > Project: ManifoldCF > Issue Type: Task > Components: Build, Lucene/SOLR connector >Affects Versions: ManifoldCF 2.3 >Reporter: Karl Wright >Assignee: Karl Wright > Fix For: ManifoldCF 2.6 > > > Solr is up to revision 5.3, and 5.3.1 is also imminent. We should bring MCF > up to date, especially since there's now a new "security module" which > provides authentication of various kinds. > It's possible that this will also require updates to httpclient. Some effort > has been made to make Kerberos be more usable in the client, although I don't > know where that effort stands. It may not be available even yet. In general > we try to upgrade to whatever version Solr is happy with, since Solr is so > dependent on precise versions of httpclient. > [~alessandro.benedetti]], perhaps you might have some experience with this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1338) Upgrade to SolrJ 6.2.0
[ https://issues.apache.org/jira/browse/CONNECTORS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1338: --- Attachment: CONNECTORS-1338.patch initial patch. just update version. > Upgrade to SolrJ 6.2.0 > -- > > Key: CONNECTORS-1338 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1338 > Project: ManifoldCF > Issue Type: Task > Components: Lucene/SOLR connector >Affects Versions: ManifoldCF 2.6 >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1338.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1238) Update MCF's lucene/solr connector to use the most recent solr jar etc.
[ https://issues.apache.org/jira/browse/CONNECTORS-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15470269#comment-15470269 ] Shinichiro Abe commented on CONNECTORS-1238: Hi Karl, Regarding r1756149, should we upgrade the ZooKeeper version? I think it's better to use the current version, 3.4.6, but MCF 2.5 has already been released. See http://apache.mirrors.hoobly.com/lucene/solr/5.5.2/changes/Changes.html#v5.5.2.versions_of_major_components Also, SOLR-8724 => Won't Fix, SOLR-9386 => open. In addition, I'd like to upgrade to SolrJ 6.2.0 on 2.6-dev. > Update MCF's lucene/solr connector to use the most recent solr jar etc. > --- > > Key: CONNECTORS-1238 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1238 > Project: ManifoldCF > Issue Type: Task > Components: Build, Lucene/SOLR connector >Affects Versions: ManifoldCF 2.3 >Reporter: Karl Wright >Assignee: Karl Wright > Fix For: ManifoldCF 2.6 > > > Solr is up to revision 5.3, and 5.3.1 is also imminent. We should bring MCF > up to date, especially since there's now a new "security module" which > provides authentication of various kinds. > It's possible that this will also require updates to httpclient. Some effort > has been made to make Kerberos be more usable in the client, although I don't > know where that effort stands. It may not be available even yet. In general > we try to upgrade to whatever version Solr is happy with, since Solr is so > dependent on precise versions of httpclient. > [~alessandro.benedetti]], perhaps you might have some experience with this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1320) Update Japanese translation of End User Document and Framework's UI
[ https://issues.apache.org/jira/browse/CONNECTORS-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1320. Resolution: Fixed close. > Update Japanese translation of End User Document and Framework's UI > --- > > Key: CONNECTORS-1320 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1320 > Project: ManifoldCF > Issue Type: Improvement > Components: Documentation, Framework core >Affects Versions: ManifoldCF 2.4 >Reporter: KOIZUMI Satoru >Assignee: Karl Wright >Priority: Minor > Fix For: ManifoldCF 2.5 > > Attachments: ja_JP.patch, ja_JP.tgz > > > I updated Japanese translation of End User Document and Framework's UI. > This update is not complete, but it might be useful. > - I revised only the section "Overview". > - I only changed the order of subsections in "Output Connection Types". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1320) Update Japanese translation of End User Document and Framework's UI
[ https://issues.apache.org/jira/browse/CONNECTORS-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306199#comment-15306199 ] Shinichiro Abe commented on CONNECTORS-1320: r1746070: Fix broken doc build. r1746071: Fix SimpleHistory Japanese name. If a new name is needed, we have to change it across the documentation XML and the UI properties. bq. but was not named correctly (welcome_screen is the correct name). Can you check that trunk has what it should have? Currently almost all PNG files in the ja_JP/ directory have the "_ja_JP" suffix in their names and are linked from end-user-documentation.xml. welcome_screen_ja_JP.PNG may be an incorrect name, but it works now; if we need to rename those files, I'd like to file that as a new issue. > Update Japanese translation of End User Document and Framework's UI > --- > > Key: CONNECTORS-1320 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1320 > Project: ManifoldCF > Issue Type: Improvement > Components: Documentation, Framework core >Affects Versions: ManifoldCF 2.4 >Reporter: KOIZUMI Satoru >Assignee: Karl Wright >Priority: Minor > Fix For: ManifoldCF 2.5 > > Attachments: ja_JP.patch, ja_JP.tgz > > > I updated Japanese translation of End User Document and Framework's UI. > This update is not complete, but it might be useful. > - I revised only the section "Overview". > - I only changed the order of subsections in "Output Connection Types". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1320) Update Japanese translation of End User Document and Framework's UI
[ https://issues.apache.org/jira/browse/CONNECTORS-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305213#comment-15305213 ] Shinichiro Abe commented on CONNECTORS-1320: i'll revert the part i pointed out for a few days. https://issues.apache.org/jira/browse/CONNECTORS-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305201#comment-15305201 ] images. One image looked like an update (welcom_screen) but was not named correctly (welcome_screen is the correct name). Can you check that trunk has what it should have? current trunk, since these changes have now been committed. the same text changes are in two files. Thanks! additional patches against current trunk. https://issues.apache.org/jira/browse/CONNECTORS-1320 > Update Japanese translation of End User Document and Framework's UI > --- > > Key: CONNECTORS-1320 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1320 > Project: ManifoldCF > Issue Type: Improvement > Components: Documentation, Framework core >Affects Versions: ManifoldCF 2.4 >Reporter: KOIZUMI Satoru >Assignee: Karl Wright >Priority: Minor > Fix For: ManifoldCF 2.5 > > Attachments: ja_JP.patch, ja_JP.tgz > > > I updated Japanese translation of End User Document and Framework's UI. > This update is not complete, but it might be useful. > - I revised only the section "Overview". > - I only changed the order of subsections in "Output Connection Types". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1320) Update Japanese translation of End User Document and Framework's UI
[ https://issues.apache.org/jira/browse/CONNECTORS-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15305194#comment-15305194 ] Shinichiro Abe commented on CONNECTORS-1320: Hi, would you update the "SimpleHistory" translation so that the same term is used in the end user documentation and the UI? That said, I prefer the current name to the newer one. I reviewed your patch, and it looks fine to me. Thanks. > Update Japanese translation of End User Document and Framework's UI > --- > > Key: CONNECTORS-1320 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1320 > Project: ManifoldCF > Issue Type: Improvement > Components: Documentation, Framework core >Affects Versions: ManifoldCF 2.4 >Reporter: KOIZUMI Satoru >Priority: Minor > Attachments: ja_JP.patch, ja_JP.tgz > > > I updated Japanese translation of End User Document and Framework's UI. > This update is not complete, but it might be useful. > - I revised only the section "Overview". > - I only changed the order of subsections in "Output Connection Types". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1265) Webconnector Bigcrawl test sometimes fails due to a NoHttpResponse exception
[ https://issues.apache.org/jira/browse/CONNECTORS-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054718#comment-15054718 ] Shinichiro Abe commented on CONNECTORS-1265: perhaps we have to reduce the value much more, because the value was 30 ms when we were using CoreConnectionPNames.STALE_CONNECTION_CHECK = true. > Webconnector Bigcrawl test sometimes fails due to a NoHttpResponse exception > > > Key: CONNECTORS-1265 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1265 > Project: ManifoldCF > Issue Type: Bug > Components: Web connector >Affects Versions: ManifoldCF 2.3 >Reporter: Karl Wright >Assignee: Karl Wright > Fix For: ManifoldCF 2.4 > > > Not sure under what conditions this occurs; the same test succeeds under > HSQLDB, so it appears to be timing related. But the http connection created > during the test seems to get whacked once in a while, and needs to retry, > which occurs at an interval of 15 minutes: > {code} > catch (NoHttpResponseException e) > { > throwable = e; > long currentTime = System.currentTimeMillis(); > throw new ServiceInterruption("Timed out waiting for response for > '"+myUrl+"': "+e.getMessage(), e, currentTime + TIME_15MIN, > currentTime + TIME_2HRS,-1,false); > } > {code} > Not clear how long that this has taken place; it may depend sensitively on > the version of httpclient that's in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1265) Webconnector Bigcrawl test sometimes fails due to a NoHttpResponse exception
[ https://issues.apache.org/jira/browse/CONNECTORS-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054723#comment-15054723 ] Shinichiro Abe commented on CONNECTORS-1265: I see. Thanks! > Webconnector Bigcrawl test sometimes fails due to a NoHttpResponse exception > > > Key: CONNECTORS-1265 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1265 > Project: ManifoldCF > Issue Type: Bug > Components: Web connector >Affects Versions: ManifoldCF 2.3 >Reporter: Karl Wright >Assignee: Karl Wright > Fix For: ManifoldCF 2.4 > > > Not sure under what conditions this occurs; the same test succeeds under > HSQLDB, so it appears to be timing related. But the http connection created > during the test seems to get whacked once in a while, and needs to retry, > which occurs at an interval of 15 minutes: > {code} > catch (NoHttpResponseException e) > { > throwable = e; > long currentTime = System.currentTimeMillis(); > throw new ServiceInterruption("Timed out waiting for response for > '"+myUrl+"': "+e.getMessage(), e, currentTime + TIME_15MIN, > currentTime + TIME_2HRS,-1,false); > } > {code} > Not clear how long that this has taken place; it may depend sensitively on > the version of httpclient that's in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1265) Webconnector Bigcrawl test sometimes fails due to a NoHttpResponse exception
[ https://issues.apache.org/jira/browse/CONNECTORS-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054709#comment-15054709 ] Shinichiro Abe commented on CONNECTORS-1265: How about reducing setValidateAfterInactivity from 6 to 2000? Http team reduced that default param size from 5000 to 2000: https://svn.apache.org/viewvc?view=revision=1681021 > Webconnector Bigcrawl test sometimes fails due to a NoHttpResponse exception > > > Key: CONNECTORS-1265 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1265 > Project: ManifoldCF > Issue Type: Bug > Components: Web connector >Affects Versions: ManifoldCF 2.3 >Reporter: Karl Wright >Assignee: Karl Wright > Fix For: ManifoldCF 2.4 > > > Not sure under what conditions this occurs; the same test succeeds under > HSQLDB, so it appears to be timing related. But the http connection created > during the test seems to get whacked once in a while, and needs to retry, > which occurs at an interval of 15 minutes: > {code} > catch (NoHttpResponseException e) > { > throwable = e; > long currentTime = System.currentTimeMillis(); > throw new ServiceInterruption("Timed out waiting for response for > '"+myUrl+"': "+e.getMessage(), e, currentTime + TIME_15MIN, > currentTime + TIME_2HRS,-1,false); > } > {code} > Not clear how long that this has taken place; it may depend sensitively on > the version of httpclient that's in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
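The effect of the setValidateAfterInactivity threshold discussed in this comment can be sketched in plain Java. This is only an illustration of the idea (a pooled connection is re-checked for staleness once it has been idle for at least the configured number of milliseconds); the class name InactivityValidationSketch and the method needsValidation are hypothetical and are not part of HttpClient or ManifoldCF. The sketch assumes that a non-positive threshold disables the check.

```java
// Hypothetical illustration only: these names are NOT HttpClient or ManifoldCF API.
final class InactivityValidationSketch {

    /**
     * Returns true when a pooled connection last used at lastUsedMillis should be
     * checked for staleness before it is leased again at nowMillis.
     * Assumption: a non-positive threshold disables the check.
     */
    static boolean needsValidation(long lastUsedMillis, long nowMillis,
                                   int validateAfterInactivityMillis) {
        if (validateAfterInactivityMillis <= 0) {
            return false;
        }
        return nowMillis - lastUsedMillis >= validateAfterInactivityMillis;
    }

    public static void main(String[] args) {
        // A connection idle for 2.5 seconds is re-validated with a 2000 ms
        // threshold, but not with a 5000 ms threshold.
        System.out.println(needsValidation(0L, 2_500L, 2_000)); // true
        System.out.println(needsValidation(0L, 2_500L, 5_000)); // false
        // A connection idle for only 1.5 seconds is leased without a check.
        System.out.println(needsValidation(0L, 1_500L, 2_000)); // false
    }
}
```

A lower threshold means idle connections are checked more often, which is why reducing it may help with a NoHttpResponseException caused by a connection that the server has already closed.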
[jira] [Commented] (CONNECTORS-1263) Use alfresco-indexer-0.8.0(-SNAPSHOT)
[ https://issues.apache.org/jira/browse/CONNECTORS-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052008#comment-15052008 ] Shinichiro Abe commented on CONNECTORS-1263: Hi Maurizio and Karl, At the moment, the tests for the last two commits are failing. See https://travis-ci.org/apache/manifoldcf/builds org.apache.manifoldcf.crawler.connectors.alfrescowebscript.tests.APISanityHSQLDBIT: (Error while creating file "/Users" [90062-185]) <- Is it related to the failure? > Use alfresco-indexer-0.8.0(-SNAPSHOT) > - > > Key: CONNECTORS-1263 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1263 > Project: ManifoldCF > Issue Type: Task > Components: Alfresco webscript connector >Affects Versions: ManifoldCF 2.2 >Reporter: Maurizio Pillitu > > Update alfresco-webscript connector to use alfresco-indexer-0.8.0 > Right now only 0.8.0-SNAPSHOT version is available on Maven Central; as soon > as Manifold trunk is tested against it, version 0.8.0 will be released (and > Manifold trunk will be updated to use the released version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1261) Solr output connector, commit within seems to be ignored
[ https://issues.apache.org/jira/browse/CONNECTORS-1261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035212#comment-15035212 ] Shinichiro Abe commented on CONNECTORS-1261: Hi, FWIW, here is a link for a quick fix when "Server update handler=/update" is used: https://lucene.apache.org/solr/5_1_0/solr-solrj/org/apache/solr/client/solrj/SolrClient.html#add(org.apache.solr.common.SolrInputDocument,%20int) The int argument is -1 or a time in ms. I tend to configure autoCommit/autoSoftCommit on the Solr side rather than posting each SolrInputDocument with commitWithin. > Solr output connector, commit within seems to be ignored > > > Key: CONNECTORS-1261 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1261 > Project: ManifoldCF > Issue Type: Bug > Components: Solr-5.x component >Affects Versions: ManifoldCF 2.3 > Environment: Ubuntu 14.04, Solr 5.3.1 >Reporter: Adrian Conlon >Assignee: Karl Wright > Fix For: ManifoldCF 2.3 > > > I've configured my output connector to commit both at the end of a job and > "within" 60 ms (which should be 10 minutes, if my arithmetic is correct). > The effect I'm seeing is that commits are occurring as jobs complete, but if > a set of long running jobs aren't completing quickly, I'm not seeing commits > every 10 minutes. > FYI, here's my Solr output connection configuration > Use extract update handler=false > Solr core name=OasysMailSearch > Server web application=solr > Server name=STGDM16 > Maximum document length=1000 > Commits=true > Solr content field name=body > Server update handler=/update > Server status handler=/admin/ping > Commit within=60 > Server remove handler=/update > Server port=8983 > Solr id field name=id > Server protocol=http -- This message was sent by Atlassian JIRA (v6.3.4#6332)
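The unit conversion behind the "commit within" discussion can be checked with a few lines. This is only a sketch: the class and helper names are hypothetical, and it assumes the value is expressed in milliseconds, as the int parameter of the SolrJ `add(SolrInputDocument, int)` method linked above is.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: converts a commit interval in minutes to milliseconds.
final class CommitWithinSketch {
    static int commitWithinMillis(long minutes) {
        // toIntExact throws ArithmeticException instead of silently overflowing
        return Math.toIntExact(TimeUnit.MINUTES.toMillis(minutes));
    }

    public static void main(String[] args) {
        System.out.println(commitWithinMillis(10)); // prints 600000
    }
}
```

The result would be the value to pass as the int argument when adding each document, or to use in an autoCommit setting on the Solr side if that approach is preferred.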
[jira] [Resolved] (CONNECTORS-1259) Use Travis CI
[ https://issues.apache.org/jira/browse/CONNECTORS-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1259. Resolution: Fixed Fix Version/s: ManifoldCF 2.3 The mvn test is passing now. https://travis-ci.org/apache/manifoldcf/builds I suppressed the mvn output that was causing the build to exceed the Travis log limit. r1717135: set TERM=dumb, but it didn't work. Then r1717189: added a quiet arg to the command directly, which did work. > Use Travis CI > - > > Key: CONNECTORS-1259 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1259 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1259.patch, CONNECTORS-1259.patch, > myrepo-ci-result.png > > > Jenkins builds supports mcf-ant and mcf-mvn today. mcf-ant includes tests but > mcf-mvn does not include test: > {noformat} > mvn -DskipTests=false -DskipITs=true clean -DskipTests -DskipITs install > {noformat} > It would be nice we could run "mvn install" rather than that. > Some Apache projects are using travis ci via Github(1), just putting > .travis.yml to project root directory. > And these tests are not limited timeout as to a build(2). > I tried to test mcf trunk using travis.yml on my repo, then tests work well. > Please see attached image. > We can watch the testing result at https://travis-ci.org/apache/manifoldcf > (public) after committing. And we can add the status badge to somewhere such > as README. > !https://travis-ci.org/sabe1/manifoldcf.svg?branch=trunk! > (1) > https://travis-ci.org/apache/jackrabbit-oak > https://travis-ci.org/apache/storm > (2) > https://docs.travis-ci.com/user/customizing-the-build/#Build-Timeouts -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1259) Use Travis CI
[ https://issues.apache.org/jira/browse/CONNECTORS-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026397#comment-15026397 ] Shinichiro Abe commented on CONNECTORS-1259: r1716324. I'll wait and see that travis works for a while. Thanks. > Use Travis CI > - > > Key: CONNECTORS-1259 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1259 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Attachments: CONNECTORS-1259.patch, CONNECTORS-1259.patch, > myrepo-ci-result.png > > > Jenkins builds supports mcf-ant and mcf-mvn today. mcf-ant includes tests but > mcf-mvn does not include test: > {noformat} > mvn -DskipTests=false -DskipITs=true clean -DskipTests -DskipITs install > {noformat} > It would be nice we could run "mvn install" rather than that. > Some Apache projects are using travis ci via Github(1), just putting > .travis.yml to project root directory. > And these tests are not limited timeout as to a build(2). > I tried to test mcf trunk using travis.yml on my repo, then tests work well. > Please see attached image. > We can watch the testing result at https://travis-ci.org/apache/manifoldcf > (public) after committing. And we can add the status badge to somewhere such > as README. > !https://travis-ci.org/sabe1/manifoldcf.svg?branch=trunk! > (1) > https://travis-ci.org/apache/jackrabbit-oak > https://travis-ci.org/apache/storm > (2) > https://docs.travis-ci.com/user/customizing-the-build/#Build-Timeouts -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1253) Add QuickStart application
[ https://issues.apache.org/jira/browse/CONNECTORS-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1253: --- Attachment: CONNECTORS-1253.patch Updated patch that fixes the dependency plugin warning. Next, I will fix connectors.xml, which does not list the other connector classes. > Add QuickStart application > -- > > Key: CONNECTORS-1253 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1253 > Project: ManifoldCF > Issue Type: Improvement > Components: Build >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe > Fix For: ManifoldCF next > > Attachments: CONNECTORS-1253.patch, CONNECTORS-1253.patch > > > A simple application packaged by maven in case developers do not use ant > build. Similar to maven exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CONNECTORS-1253) Add QuickStart application
[ https://issues.apache.org/jira/browse/CONNECTORS-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe reassigned CONNECTORS-1253: -- Assignee: Shinichiro Abe > Add QuickStart application > -- > > Key: CONNECTORS-1253 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1253 > Project: ManifoldCF > Issue Type: Improvement > Components: Build >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Fix For: ManifoldCF next > > Attachments: CONNECTORS-1253.patch, CONNECTORS-1253.patch > > > A simple application packaged by maven in case developers do not use ant > build. Similar to maven exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1052) Invoking mvn from ant
[ https://issues.apache.org/jira/browse/CONNECTORS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1052. Resolution: Won't Fix Fix Version/s: (was: ManifoldCF next) This issue is for both ant and maven testing essentially. So I switch to CONNECTORS-1259. > Invoking mvn from ant > - > > Key: CONNECTORS-1052 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1052 > Project: ManifoldCF > Issue Type: Improvement > Components: Alfresco connector, Build >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > > Using ant 'test', AlfrescoConnector test is ignored. Because ant doesn't call > 'mvn package' at test-materials/alfresco-war and copy the war to proper test > dir. It would be nice if we could call mvn command from ant, including > compile jars and test. > Currently, this test is skipped when we use ant. > {noformat} > $ ant make-core-deps make-deps build test > Or > $ cd connectors/alfresco > $ ant run-IT-HSQLDB > ... > pretest-warn: > [echo] Alfresco Connector integration tests cannot be be performed > without alfresco.war > {noformat} > Also, it seems there is a difference between build.xml and pom.xml about test > content(alfresco war/client version). > I'm not sure what is correct. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1253) Add QuickStart application
[ https://issues.apache.org/jira/browse/CONNECTORS-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1253: --- Fix Version/s: ManifoldCF next > Add QuickStart application > -- > > Key: CONNECTORS-1253 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1253 > Project: ManifoldCF > Issue Type: Improvement > Components: Build >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe > Fix For: ManifoldCF next > > Attachments: CONNECTORS-1253.patch, CONNECTORS-1253.patch > > > A simple application packaged by maven in case developers do not use ant > build. Similar to maven exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1259) Use Travis CI
[ https://issues.apache.org/jira/browse/CONNECTORS-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1259: --- Attachment: CONNECTORS-1259.patch Updated patch that supports running both ant and maven testing in parallel. each test took 40-45 minutes on travis. > Use Travis CI > - > > Key: CONNECTORS-1259 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1259 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Attachments: CONNECTORS-1259.patch, CONNECTORS-1259.patch, > myrepo-ci-result.png > > > Jenkins builds supports mcf-ant and mcf-mvn today. mcf-ant includes tests but > mcf-mvn does not include test: > {noformat} > mvn -DskipTests=false -DskipITs=true clean -DskipTests -DskipITs install > {noformat} > It would be nice we could run "mvn install" rather than that. > Some Apache projects are using travis ci via Github(1), just putting > .travis.yml to project root directory. > And these tests are not limited timeout as to a build(2). > I tried to test mcf trunk using travis.yml on my repo, then tests work well. > Please see attached image. > We can watch the testing result at https://travis-ci.org/apache/manifoldcf > (public) after committing. And we can add the status badge to somewhere such > as README. > !https://travis-ci.org/sabe1/manifoldcf.svg?branch=trunk! > (1) > https://travis-ci.org/apache/jackrabbit-oak > https://travis-ci.org/apache/storm > (2) > https://docs.travis-ci.com/user/customizing-the-build/#Build-Timeouts -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1259) Use Travis CI
[ https://issues.apache.org/jira/browse/CONNECTORS-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024588#comment-15024588 ] Shinichiro Abe commented on CONNECTORS-1259: The build trigger is different between Jenkins and Travis. Jenkins polls for new commits on a schedule. OTOH, Travis runs a build on each commit when developers commit. https://docs.travis-ci.com/user/getting-started/ We have a problem: mvn developers don't run the ant tests, and ant developers don't run the mvn tests. Travis will be able to verify commits just in time. Of course we have to avoid repeated automated build failures, like the many build emails in the lucene/solr team. > Use Travis CI > - > > Key: CONNECTORS-1259 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1259 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Attachments: CONNECTORS-1259.patch, myrepo-ci-result.png > > > Jenkins builds supports mcf-ant and mcf-mvn today. mcf-ant includes tests but > mcf-mvn does not include test: > {noformat} > mvn -DskipTests=false -DskipITs=true clean -DskipTests -DskipITs install > {noformat} > It would be nice we could run "mvn install" rather than that. > Some Apache projects are using travis ci via Github(1), just putting > .travis.yml to project root directory. > And these tests are not limited timeout as to a build(2). > I tried to test mcf trunk using travis.yml on my repo, then tests work well. > Please see attached image. > We can watch the testing result at https://travis-ci.org/apache/manifoldcf > (public) after committing. And we can add the status badge to somewhere such > as README. > !https://travis-ci.org/sabe1/manifoldcf.svg?branch=trunk!
> (1) > https://travis-ci.org/apache/jackrabbit-oak > https://travis-ci.org/apache/storm > (2) > https://docs.travis-ci.com/user/customizing-the-build/#Build-Timeouts -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1259) Use Travis CI
[ https://issues.apache.org/jira/browse/CONNECTORS-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024546#comment-15024546 ] Shinichiro Abe commented on CONNECTORS-1259: Hi Karl, FYI: https://blogs.apache.org/infra/entry/apache_gains_additional_travis_ci It seems the INFRA team supports "30 concurrent builds", so there is no guarantee that a build machine is reserved for mcf alone. The build machine could become overloaded. Should I set the param -DskipITs=true, or disable all tests? > Use Travis CI > - > > Key: CONNECTORS-1259 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1259 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Attachments: CONNECTORS-1259.patch, myrepo-ci-result.png > > > Jenkins builds supports mcf-ant and mcf-mvn today. mcf-ant includes tests but > mcf-mvn does not include test: > {noformat} > mvn -DskipTests=false -DskipITs=true clean -DskipTests -DskipITs install > {noformat} > It would be nice we could run "mvn install" rather than that. > Some Apache projects are using travis ci via Github(1), just putting > .travis.yml to project root directory. > And these tests are not limited timeout as to a build(2). > I tried to test mcf trunk using travis.yml on my repo, then tests work well. > Please see attached image. > We can watch the testing result at https://travis-ci.org/apache/manifoldcf > (public) after committing. And we can add the status badge to somewhere such > as README. > !https://travis-ci.org/sabe1/manifoldcf.svg?branch=trunk! > (1) > https://travis-ci.org/apache/jackrabbit-oak > https://travis-ci.org/apache/storm > (2) > https://docs.travis-ci.com/user/customizing-the-build/#Build-Timeouts -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1259) Use Travis CI
[ https://issues.apache.org/jira/browse/CONNECTORS-1259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1259: --- Attachment: myrepo-ci-result.png CONNECTORS-1259.patch image: build status image. patch: add .travis.yml. email notification is hard to configure, so disabled. > Use Travis CI > - > > Key: CONNECTORS-1259 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1259 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Attachments: CONNECTORS-1259.patch, myrepo-ci-result.png > > > Jenkins builds supports mcf-ant and mcf-mvn today. mcf-ant includes tests but > mcf-mvn does not include test: > {noformat} > mvn -DskipTests=false -DskipITs=true clean -DskipTests -DskipITs install > {noformat} > It would be nice we could run "mvn install" rather than that. > Some Apache projects are using travis ci via Github(1), just putting > .travis.yml to project root directory. > And these tests are not limited timeout as to a build(2). > I tried to test mcf trunk using travis.yml on my repo, then tests work well. > Please see attached image. > We can watch the testing result at https://travis-ci.org/apache/manifoldcf > (public) after committing. And we can add the status badge to somewhere such > as README. > !https://travis-ci.org/sabe1/manifoldcf.svg?branch=trunk! > (1) > https://travis-ci.org/apache/jackrabbit-oak > https://travis-ci.org/apache/storm > (2) > https://docs.travis-ci.com/user/customizing-the-build/#Build-Timeouts -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1253) Add QuickStart application
[ https://issues.apache.org/jira/browse/CONNECTORS-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1253: --- Description: A simple application packaged by maven in case developers do not use ant build. Similar to maven exec. > Add QuickStart application > -- > > Key: CONNECTORS-1253 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1253 > Project: ManifoldCF > Issue Type: Bug > Components: Build >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe > > A simple application packaged by maven in case developers do not use ant > build. Similar to maven exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1253) Add QuickStart application
[ https://issues.apache.org/jira/browse/CONNECTORS-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1253: --- Attachment: CONNECTORS-1253.patch patch: {noformat} $ mvn install -DskipTests $ cd framework/jetty-runner/target/apache-manifoldcf-${project.version}/ $ ./bin/manifoldcf {noformat} directory: apache-manifoldcf-2.3-SNAPSHOT L bin L conf L lib L webapps > Add QuickStart application > -- > > Key: CONNECTORS-1253 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1253 > Project: ManifoldCF > Issue Type: Bug > Components: Build >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1253.patch > > > A simple application packaged by maven in case developers do not use ant > build. Similar to maven exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1253) Add QuickStart application
[ https://issues.apache.org/jira/browse/CONNECTORS-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1253: --- Issue Type: Improvement (was: Bug) > Add QuickStart application > -- > > Key: CONNECTORS-1253 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1253 > Project: ManifoldCF > Issue Type: Improvement > Components: Build >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe > Attachments: CONNECTORS-1253.patch > > > A simple application packaged by maven in case developers do not use ant > build. Similar to maven exec. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CONNECTORS-1252) jetty.xml is missing in jetty-runner QuickStart
Shinichiro Abe created CONNECTORS-1252: -- Summary: jetty.xml is missing in jetty-runner QuickStart Key: CONNECTORS-1252 URL: https://issues.apache.org/jira/browse/CONNECTORS-1252 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 2.2 Reporter: Shinichiro Abe Assignee: Shinichiro Abe Priority: Minor mvn exec:exec fails at startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1252) jetty.xml is missing in jetty-runner QuickStart
[ https://issues.apache.org/jira/browse/CONNECTORS-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1252. Resolution: Fixed Fix Version/s: ManifoldCF 2.3 r1715075. > jetty.xml is missing in jetty-runner QuickStart > --- > > Key: CONNECTORS-1252 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1252 > Project: ManifoldCF > Issue Type: Bug > Components: Framework core >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > > mvn exec:exec fails at startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1254) jetty dependencies is missing in jetty-runner
[ https://issues.apache.org/jira/browse/CONNECTORS-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1254: --- Attachment: CONNECTORS-1254.patch patch. maven test passed. I'll commit it after ant test. > jetty dependencies is missing in jetty-runner > - > > Key: CONNECTORS-1254 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1254 > Project: ManifoldCF > Issue Type: Bug > Components: Framework core >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Priority: Minor > Attachments: CONNECTORS-1254.patch > > > In current trunk, I run jetty-runner by mvn exec and operate crawler ui. > Then I saw the following message on my console. > {noformat} > WARN: Unknown target VM 1.7 ignored. > WARN: Unknown source VM 1.7 ignored. > WARN: Unknown target VM 1.7 ignored. > WARN: Unknown source VM 1.7 ignored. > ... > {noformat} > When I append lost jetty dependencies in jetty-runner' pom, that message > doesn't happen. > It seems I've forgotten to do that on CONNECTORS-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CONNECTORS-1254) jetty dependencies is missing in jetty-runner
Shinichiro Abe created CONNECTORS-1254: -- Summary: jetty dependencies is missing in jetty-runner Key: CONNECTORS-1254 URL: https://issues.apache.org/jira/browse/CONNECTORS-1254 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 2.2 Reporter: Shinichiro Abe Priority: Minor In current trunk, I run jetty-runner by mvn exec and operate crawler ui. Then I saw the following message on my console. {noformat} WARN: Unknown target VM 1.7 ignored. WARN: Unknown source VM 1.7 ignored. WARN: Unknown target VM 1.7 ignored. WARN: Unknown source VM 1.7 ignored. ... {noformat} When I append lost jetty dependencies in jetty-runner' pom, that message doesn't happen. It seems I've forgotten to do that on CONNECTORS-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CONNECTORS-1255) Ignore CheckObjectIDTest
Shinichiro Abe created CONNECTORS-1255: -- Summary: Ignore CheckObjectIDTest Key: CONNECTORS-1255 URL: https://issues.apache.org/jira/browse/CONNECTORS-1255 Project: ManifoldCF Issue Type: Bug Components: CMIS connector Reporter: Shinichiro Abe Currently trunk, ant test fails because of : {noformat} run-tests: [junit] Testsuite: org.apache.manifoldcf.crawler.connectors.cmis.tests.CheckObjectIDTest [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.014 sec [junit] [junit] Testcase: initializationError(org.apache.manifoldcf.crawler.connectors.cmis.tests.CheckObjectIDTest): Caused an ERROR [junit] No runnable methods [junit] java.lang.Exception: No runnable methods [junit] at java.lang.reflect.Constructor.newInstance(Constructor.java:422) [junit] {noformat} Since that is just an echo test, I'll make it disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1255) Ignore CheckObjectIDTest
[ https://issues.apache.org/jira/browse/CONNECTORS-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1255. Resolution: Fixed Assignee: Shinichiro Abe Fix Version/s: ManifoldCF 2.3 r1715112. > Ignore CheckObjectIDTest > > > Key: CONNECTORS-1255 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1255 > Project: ManifoldCF > Issue Type: Bug > Components: CMIS connector >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Fix For: ManifoldCF 2.3 > > > Currently trunk, ant test fails because of : > {noformat} > run-tests: > [junit] Testsuite: > org.apache.manifoldcf.crawler.connectors.cmis.tests.CheckObjectIDTest > [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 0.014 sec > [junit] > [junit] Testcase: > initializationError(org.apache.manifoldcf.crawler.connectors.cmis.tests.CheckObjectIDTest): > Caused an ERROR > [junit] No runnable methods > [junit] java.lang.Exception: No runnable methods > [junit] at > java.lang.reflect.Constructor.newInstance(Constructor.java:422) > [junit] > {noformat} > Since that is just an echo test, I'll make it disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1254) jetty dependencies is missing in jetty-runner
[ https://issues.apache.org/jira/browse/CONNECTORS-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1254. Resolution: Fixed Assignee: Shinichiro Abe Fix Version/s: ManifoldCF 2.3 The ant testing has passed. r1715118. > jetty dependencies is missing in jetty-runner > - > > Key: CONNECTORS-1254 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1254 > Project: ManifoldCF > Issue Type: Bug > Components: Framework core >Affects Versions: ManifoldCF 2.2 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1254.patch > > > In current trunk, I run jetty-runner by mvn exec and operate crawler ui. > Then I saw the following message on my console. > {noformat} > WARN: Unknown target VM 1.7 ignored. > WARN: Unknown source VM 1.7 ignored. > WARN: Unknown target VM 1.7 ignored. > WARN: Unknown source VM 1.7 ignored. > ... > {noformat} > When I append lost jetty dependencies in jetty-runner' pom, that message > doesn't happen. > It seems I've forgotten to do that on CONNECTORS-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1248) CMIS connector doesn't detect changes to content under some conditions
[ https://issues.apache.org/jira/browse/CONNECTORS-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979420#comment-14979420 ] Shinichiro Abe commented on CONNECTORS-1248: Ah, trunk has been fixed! Thanks. > CMIS connector doesn't detect changes to content under some conditions > -- > > Key: CONNECTORS-1248 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1248 > Project: ManifoldCF > Issue Type: Bug > Components: CMIS connector >Affects Versions: ManifoldCF 2.2 >Reporter: Karl Wright >Assignee: Karl Wright > Fix For: ManifoldCF 2.3 > > Attachments: sample-patch-for1.7.2.txt > > > As commented elsewhere: > {quote} > When I worked with the CMIS connector I had to modify the logic to append > document.getLastModificationDate().getTimeInMillis() to the versionString for > it to pick up changes. The Alfresco document version won't update when you > modify metadata. My memory is terrible, but I believe that even modifying > content may not do it unless you have the proper 'versioning' aspect applied. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
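The workaround quoted in the issue description, appending the last-modification time to the version string, can be sketched as follows. This is a hypothetical illustration rather than the connector's actual code: the class name, method name, and ":" separator are assumptions, and a plain `Calendar` stands in for the value that `document.getLastModificationDate()` is described as returning.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical sketch: make the version string change whenever the document is modified.
final class CmisVersionStringSketch {
    /** Combines the repository's version label with the last-modified time in millis. */
    static String versionString(String versionLabel, Calendar lastModified) {
        return versionLabel + ":" + lastModified.getTimeInMillis();
    }

    public static void main(String[] args) {
        Calendar before = new GregorianCalendar();
        before.setTimeInMillis(1_000L);
        Calendar after = new GregorianCalendar();
        after.setTimeInMillis(2_000L);

        // Same version label, different modification times: the strings differ,
        // so a crawler comparing version strings would see the document as changed.
        System.out.println(versionString("1.0", before));
        System.out.println(versionString("1.0", after));
    }
}
```

The design point is that a metadata-only edit may leave the version label untouched, so the label alone is not a reliable change signal.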
[jira] [Commented] (CONNECTORS-1181) Apache Stanbol Transformation Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14944768#comment-14944768 ] Shinichiro Abe commented on CONNECTORS-1181: Hi, any progress here? I am playing with Stanbol REST API by using kuromoji-nlp [example|https://sites.google.com/site/shinichiroapacheorg/stanbol-in-5-minutes] a bit today so I'm interested in this ticket. Thanks. > Apache Stanbol Transformation Connector > --- > > Key: CONNECTORS-1181 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1181 > Project: ManifoldCF > Issue Type: Wish >Affects Versions: ManifoldCF 1.8.2, ManifoldCF 2.0.2 >Reporter: Rafa Haro >Assignee: Rafa Haro >Priority: Minor > Labels: connect, transformation > Fix For: ManifoldCF 2.3 > > > Apache Stanbol (https://stanbol.apache.org/) provides a set of reusable > components for semantic content management. One of this component is the > Enhancer (https://stanbol.apache.org/docs/trunk/components/enhancer/) which > allows to extract features and semantic metadata from textual content like > entities/concepts from domain ontologies, named entities and so on. > Apache Stanbol provides an easy-to-use REST API. The main idea behind this > transformation connector would be to enrich the Repository Document's > (string) content with a configured Stanbol processing chain. The > Transformation Connector would allow the user to configure the metadata that > will be extracted from the Enhancer result for including it as RD's metadata > This behavior come to somehow replace the functionality of the old Apache > Stanbol CMS Adapter > (https://stanbol.apache.org/docs/trunk/components/cmsadapter/) and ContentHub > (https://stanbol.apache.org/docs/trunk/components/contenthub/) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1234. Resolution: Fixed r1705040. > TikaExtractor based indexing on Elasticsearch connector > --- > > Key: CONNECTORS-1234 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 > Project: ManifoldCF > Issue Type: Improvement >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1234.patch, CONNECTORS-1234.patch > > > We could add the use-mapper-attachments flag. > Default to true, current spec which asks for mapper-attachments plugin on ES > side. > If false, it allows us to index the content and metadata that extracted from > files through Tika transformer, which means there is no need to install that > plugin and put base64 encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1234: --- Attachment: CONNECTORS-1234.patch Updated patch that removes the document length checking from previous patch. I'm ready to commit. > TikaExtractor based indexing on Elasticsearch connector > --- > > Key: CONNECTORS-1234 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 > Project: ManifoldCF > Issue Type: Improvement >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1234.patch, CONNECTORS-1234.patch > > > We could add the use-mapper-attachments flag. > Default to true, current spec which asks for mapper-attachments plugin on ES > side. > If false, it allows us to index the content and metadata that extracted from > files through Tika transformer, which means there is no need to install that > plugin and put base64 encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1231) Upgrade to JUnit 4.12
[ https://issues.apache.org/jira/browse/CONNECTORS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1231. Resolution: Fixed Committed to trunk in r1703359. "ant test" passes with ant 1.8.2 without putting junit-4.12.jar in ANT_HOME/lib. Sorry for the confusion. > Upgrade to JUnit 4.12 > - > > Key: CONNECTORS-1231 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1231 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.3 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1231.patch, CONNECTORS-1231.patch > > > Background: dev mail thread > [link|http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201508.mbox/%3CCA%2BeTv_XTcU3bRTdS2GTi3apQCkJrzhnbS8oUZkcan%2BiW5goN8A%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1231) Upgrade to JUnit 4.12
[ https://issues.apache.org/jira/browse/CONNECTORS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745329#comment-14745329 ] Shinichiro Abe commented on CONNECTORS-1231: Thanks for testing, Karl. I got the error above when using ant 1.8.2; after I added junit 4.12, the test passed. IIUC, the ant-junit4.jar version is defined in libraries.properties. My ant 1.8.2 -> junit 4.8.1: https://github.com/apache/ant/blob/ANT_182/lib/libraries.properties#L48 Latest ant -> junit 4.11: https://github.com/apache/ant/blob/ANT_196/lib/libraries.properties#L48 > Upgrade to JUnit 4.12 > - > > Key: CONNECTORS-1231 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1231 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.3 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1231.patch, CONNECTORS-1231.patch > > > Background: dev mail thread > [link|http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201508.mbox/%3CCA%2BeTv_XTcU3bRTdS2GTi3apQCkJrzhnbS8oUZkcan%2BiW5goN8A%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1231) Upgrade to JUnit 4.12
[ https://issues.apache.org/jira/browse/CONNECTORS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1231: --- Attachment: CONNECTORS-1231.patch Updated patch. For a few weeks I had been hitting intermittent timeout errors when testing with maven after upgrading the surefire plugin to 2.14. A test would fail on the first run but pass on the second, or fail on another machine but pass on mine. A sample error I got: {noformat} Tests in error: sessionCrawl(org.apache.manifoldcf.crawler.connectors.webcrawler.tests.SessionLoginHSQLDBIT): ManifoldCF did not terminate in the allotted time of 60 milliseconds {noformat} Apparently these were caused by the forkMode-related params in the surefire plugin configs. {noformat} [WARNING] The parameter forkMode is deprecated since 2.14. Use forkCount and reuseForks instead. {noformat} In Surefire 2.14, reuseForks defaults to true, which means the forked processes are reused to execute the next tests. The MCF POMs use forkMode=always, which corresponds to reuseForks=false according to [1]. Also, maven-failsafe-plugin supports the reuseForks param only since 2.13 [2], but the current version is 2.12.3. In this patch I've replaced forkMode with forkCount and reuseForks, and upgraded the failsafe version. We may also have to put junit-4.12.jar in ANT_HOME/lib when ant test runs. All tests (mvn clean install, ant test, ant uitest) now pass without the warning, but I'm afraid the tests may still fail on someone else's machine. 
[1] https://maven.apache.org/surefire/maven-surefire-plugin/examples/fork-options-and-parallel-execution.html [2] https://maven.apache.org/components/surefire/maven-failsafe-plugin/integration-test-mojo.html > Upgrade to JUnit 4.12 > - > > Key: CONNECTORS-1231 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1231 > Project: ManifoldCF > Issue Type: Test > Components: Tests >Affects Versions: ManifoldCF 2.3 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1231.patch, CONNECTORS-1231.patch > > > Background: dev mail thread > [link|http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201508.mbox/%3CCA%2BeTv_XTcU3bRTdS2GTi3apQCkJrzhnbS8oUZkcan%2BiW5goN8A%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1236) Upgrade to tika 1.10
[ https://issues.apache.org/jira/browse/CONNECTORS-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14738340#comment-14738340 ] Shinichiro Abe commented on CONNECTORS-1236: Hi Karl, I've looked at your commits in the branch. FYI: for r1702136, you will want to add hamcrest-core*.jar since junit 4.12 depends on it. Please see the CONNECTORS-1231 patch. For r1702176, please dedupe protobuf-java*.jar in build.xml. We could also add a wider variety of binary files to the test-documents dir for TikaParserTest to strengthen the Tika test. > Upgrade to tika 1.10 > > > Key: CONNECTORS-1236 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1236 > Project: ManifoldCF > Issue Type: Bug > Components: Tika extractor >Affects Versions: ManifoldCF 2.3 >Reporter: Patryk Szweda >Assignee: Karl Wright > > Upgrade to 1.10. This will require rework of dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1234: --- Fix Version/s: ManifoldCF 2.3 > TikaExtractor based indexing on Elasticsearch connector > --- > > Key: CONNECTORS-1234 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 > Project: ManifoldCF > Issue Type: Improvement >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1234.patch > > > We could add the use-mapper-attachments flag. > Default to true, current spec which asks for mapper-attachments plugin on ES > side. > If false, it allows us to index the content and metadata that extracted from > files through Tika transformer, which means there is no need to install that > plugin and put base64 encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CONNECTORS-1230) Add writeLimit option on Tika extractor
[ https://issues.apache.org/jira/browse/CONNECTORS-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe resolved CONNECTORS-1230. Resolution: Fixed r1700924. > Add writeLimit option on Tika extractor > --- > > Key: CONNECTORS-1230 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1230 > Project: ManifoldCF > Issue Type: Improvement > Components: Tika extractor >Affects Versions: ManifoldCF 2.3 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1230.patch, CONNECTORS-1230.patch > > > To filter out documents that exceed specified maximum length of strings. > Default to -1 for no limit, this is a current default behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1230) Add writeLimit option on Tika extractor
[ https://issues.apache.org/jira/browse/CONNECTORS-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1230: --- Attachment: CONNECTORS-1230.patch Updated patch: fix for the convention that an empty text box means "no limit". I'll commit tomorrow. > Add writeLimit option on Tika extractor > --- > > Key: CONNECTORS-1230 > URL: https://issues.apache.org/jira/browse/CONNECTORS-1230 > Project: ManifoldCF > Issue Type: Improvement > Components: Tika extractor >Affects Versions: ManifoldCF 2.3 >Reporter: Shinichiro Abe >Assignee: Shinichiro Abe >Priority: Minor > Fix For: ManifoldCF 2.3 > > Attachments: CONNECTORS-1230.patch, CONNECTORS-1230.patch > > > To filter out documents that exceed specified maximum length of strings. > Default to -1 for no limit, this is a current default behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
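Editor's note: the convention described above (an empty text box means "no limit", stored as -1) can be sketched roughly as follows. This is only an illustrative sketch, not the code from CONNECTORS-1230.patch; the class name WriteLimitSketch and both method names are hypothetical, and the real Tika extractor may enforce the limit differently (for example inside the content handler).

```java
// Hypothetical sketch of the writeLimit convention; not the actual patch.
class WriteLimitSketch {
  // -1 means "no limit", matching the default described in the issue.
  static final int NO_LIMIT = -1;

  // Interpret the value of the (assumed) UI text box.
  // An empty or blank box is treated as "no limit".
  static int parseWriteLimit(String fieldValue) {
    if (fieldValue == null || fieldValue.trim().isEmpty()) {
      return NO_LIMIT;
    }
    return Integer.parseInt(fieldValue.trim());
  }

  // True when the extracted text is longer (in characters) than the limit
  // and the document should therefore be filtered out.
  static boolean exceedsLimit(CharSequence extractedText, int writeLimit) {
    return writeLimit != NO_LIMIT && extractedText.length() > writeLimit;
  }

  public static void main(String[] args) {
    int limit = parseWriteLimit("");
    System.out.println("limit=" + limit
        + " exceeds=" + exceedsLimit("some extracted text", limit));
  }
}
```

Keeping the "empty means -1" translation in one place means the rest of the code only ever has to compare against a single sentinel value.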
[jira] [Commented] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717543#comment-14717543 ] Shinichiro Abe commented on CONNECTORS-1234: bq. it was already dropped I don't think that is correct, because the checks are still there [here|https://svn.apache.org/viewvc/manifoldcf/trunk/connectors/solr/connector/src/main/java/org/apache/manifoldcf/agents/output/solr/SolrConnector.java?view=markup#l849] and [here|https://svn.apache.org/viewvc/manifoldcf/trunk/connectors/solr/connector/src/main/java/org/apache/manifoldcf/agents/output/solr/HttpPoster.java?view=markup#l543]. They should be replaced with the Document Filter transformer instead. I'll file a new JIRA. TikaExtractor based indexing on Elasticsearch connector --- Key: CONNECTORS-1234 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 Project: ManifoldCF Issue Type: Improvement Reporter: Shinichiro Abe Assignee: Shinichiro Abe Attachments: CONNECTORS-1234.patch We could add the use-mapper-attachments flag. It defaults to true, the current behavior, which requires the mapper-attachments plugin on the ES side. If false, it allows us to index the content and metadata extracted from files through the Tika transformer, which means there is no need to install that plugin or to send base64-encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717564#comment-14717564 ] Shinichiro Abe edited comment on CONNECTORS-1234 at 8/27/15 9:28 PM: - Yes, because of Lucene, I think. If users do not configure [ignore_above|https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html] for the content field, some exceptions will be thrown on the ES side. was (Author: shinichiro abe): Yes, because of Lucene, I think. If users have to configure [ignore_above|https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html], some exceptions will be thrown on ES side. TikaExtractor based indexing on Elasticsearch connector --- Key: CONNECTORS-1234 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 Project: ManifoldCF Issue Type: Improvement Reporter: Shinichiro Abe Assignee: Shinichiro Abe Attachments: CONNECTORS-1234.patch We could add the use-mapper-attachments flag. It defaults to true, the current behavior, which requires the mapper-attachments plugin on the ES side. If false, it allows us to index the content and metadata extracted from files through the Tika transformer, which means there is no need to install that plugin or to send base64-encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717564#comment-14717564 ] Shinichiro Abe commented on CONNECTORS-1234: Yes, because of Lucene, I think. If users have to configure [ignore_above|https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html], some exceptions will be thrown on ES side. TikaExtractor based indexing on Elasticsearch connector --- Key: CONNECTORS-1234 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 Project: ManifoldCF Issue Type: Improvement Reporter: Shinichiro Abe Assignee: Shinichiro Abe Attachments: CONNECTORS-1234.patch We could add the use-mapper-attachments flag. Default to true, current spec which asks for mapper-attachments plugin on ES side. If false, it allows us to index the content and metadata that extracted from files through Tika transformer, which means there is no need to install that plugin and put base64 encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717712#comment-14717712 ] Shinichiro Abe commented on CONNECTORS-1234: bq. they are streamed Does this mean the home-grown ES httpclient? If so, it does not construct the document in memory, which differs from the SolrJ client. I'll try posting very large files later; I don't have a large file at the moment. Thanks. TikaExtractor based indexing on Elasticsearch connector --- Key: CONNECTORS-1234 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 Project: ManifoldCF Issue Type: Improvement Reporter: Shinichiro Abe Assignee: Shinichiro Abe Attachments: CONNECTORS-1234.patch We could add the use-mapper-attachments flag. It defaults to true, the current behavior, which requires the mapper-attachments plugin on the ES side. If false, it allows us to index the content and metadata extracted from files through the Tika transformer, which means there is no need to install that plugin or to send base64-encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14712656#comment-14712656 ] Shinichiro Abe commented on CONNECTORS-1234: I agree about the duplication. I'll drop that. I just copied the content length check from the Solr connector. Btw, may I drop that check in the Solr connector too, or should we keep it for back compat? TikaExtractor based indexing on Elasticsearch connector --- Key: CONNECTORS-1234 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 Project: ManifoldCF Issue Type: Improvement Reporter: Shinichiro Abe Assignee: Shinichiro Abe Attachments: CONNECTORS-1234.patch We could add the use-mapper-attachments flag. It defaults to true, the current behavior, which requires the mapper-attachments plugin on the ES side. If false, it allows us to index the content and metadata extracted from files through the Tika transformer, which means there is no need to install that plugin or to send base64-encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
Shinichiro Abe created CONNECTORS-1234: -- Summary: TikaExtractor based indexing on Elasticsearch connector Key: CONNECTORS-1234 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 Project: ManifoldCF Issue Type: Improvement Reporter: Shinichiro Abe Assignee: Shinichiro Abe We could add the use-mapper-attachments flag. It defaults to true, the current behavior, which requires the mapper-attachments plugin on the ES side. If false, it allows us to index the content and metadata extracted from files through the Tika transformer, which means there is no need to install that plugin or to send base64-encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
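Editor's note: the difference between the two modes of the proposed use-mapper-attachments flag can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the connector's actual code: the class EsBodySketch, the method buildBody, and the JSON field names "file" and "content" are all hypothetical, and metadata fields are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch of the two indexing modes; not the actual connector code.
class EsBodySketch {
  // With the flag on, the raw file is sent base64-encoded and the
  // mapper-attachments plugin on the ES side is expected to extract it.
  // With the flag off, text already extracted upstream (e.g. by the Tika
  // transformer) is sent as a plain string field, so no plugin is needed.
  static String buildBody(boolean useMapperAttachments, byte[] rawFile, String extractedText) {
    if (useMapperAttachments) {
      String encoded = Base64.getEncoder().encodeToString(rawFile);
      return "{\"file\":\"" + encoded + "\"}";
    }
    return "{\"content\":\"" + escapeJson(extractedText) + "\"}";
  }

  // Minimal JSON string escaping for the sketch.
  static String escapeJson(String s) {
    StringBuilder sb = new StringBuilder();
    for (char c : s.toCharArray()) {
      switch (c) {
        case '"': sb.append("\\\""); break;
        case '\\': sb.append("\\\\"); break;
        case '\n': sb.append("\\n"); break;
        case '\r': sb.append("\\r"); break;
        case '\t': sb.append("\\t"); break;
        default:
          if (c < 0x20) {
            sb.append(String.format("\\u%04x", (int) c));
          } else {
            sb.append(c);
          }
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    byte[] raw = "hi".getBytes(StandardCharsets.UTF_8);
    System.out.println(buildBody(true, raw, null));
    System.out.println(buildBody(false, null, "hi"));
  }
}
```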
[jira] [Updated] (CONNECTORS-1234) TikaExtractor based indexing on Elasticsearch connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1234: --- Attachment: CONNECTORS-1234.patch Attached a patch; the tests pass. TikaExtractor based indexing on Elasticsearch connector --- Key: CONNECTORS-1234 URL: https://issues.apache.org/jira/browse/CONNECTORS-1234 Project: ManifoldCF Issue Type: Improvement Reporter: Shinichiro Abe Assignee: Shinichiro Abe Attachments: CONNECTORS-1234.patch We could add the use-mapper-attachments flag. It defaults to true, the current behavior, which requires the mapper-attachments plugin on the ES side. If false, it allows us to index the content and metadata extracted from files through the Tika transformer, which means there is no need to install that plugin or to send base64-encoded content. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CONNECTORS-1231) Upgrade to JUnit 4.12
Shinichiro Abe created CONNECTORS-1231: -- Summary: Upgrade to JUnit 4.12 Key: CONNECTORS-1231 URL: https://issues.apache.org/jira/browse/CONNECTORS-1231 Project: ManifoldCF Issue Type: Test Reporter: Shinichiro Abe Assignee: Shinichiro Abe Priority: Minor Background: dev mail thread [link|http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201508.mbox/%3CCA%2BeTv_XTcU3bRTdS2GTi3apQCkJrzhnbS8oUZkcan%2BiW5goN8A%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1231) Upgrade to JUnit 4.12
[ https://issues.apache.org/jira/browse/CONNECTORS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1231: --- Attachment: CONNECTORS-1231.patch A patch with which all of the tests pass without any problem. I'll commit it after the ManifoldCF 2.2 release. Upgrade to JUnit 4.12 - Key: CONNECTORS-1231 URL: https://issues.apache.org/jira/browse/CONNECTORS-1231 Project: ManifoldCF Issue Type: Test Reporter: Shinichiro Abe Assignee: Shinichiro Abe Priority: Minor Attachments: CONNECTORS-1231.patch Background: dev mail thread [link|http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201508.mbox/%3CCA%2BeTv_XTcU3bRTdS2GTi3apQCkJrzhnbS8oUZkcan%2BiW5goN8A%40mail.gmail.com%3E]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CONNECTORS-1230) Add writeLimit option on Tika extractor
Shinichiro Abe created CONNECTORS-1230: -- Summary: Add writeLimit option on Tika extractor Key: CONNECTORS-1230 URL: https://issues.apache.org/jira/browse/CONNECTORS-1230 Project: ManifoldCF Issue Type: Improvement Components: Tika extractor Reporter: Shinichiro Abe Assignee: Shinichiro Abe Priority: Minor Filter out documents that exceed a specified maximum string length. Defaults to -1 (no limit), which is the current default behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CONNECTORS-1230) Add writeLimit option on Tika extractor
[ https://issues.apache.org/jira/browse/CONNECTORS-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shinichiro Abe updated CONNECTORS-1230: --- Attachment: CONNECTORS-1230.patch A patch including a TikaParser test. Add writeLimit option on Tika extractor --- Key: CONNECTORS-1230 URL: https://issues.apache.org/jira/browse/CONNECTORS-1230 Project: ManifoldCF Issue Type: Improvement Components: Tika extractor Reporter: Shinichiro Abe Assignee: Shinichiro Abe Priority: Minor Attachments: CONNECTORS-1230.patch Filter out documents that exceed a specified maximum string length. Defaults to -1 (no limit), which is the current default behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CONNECTORS-1230) Add writeLimit option on Tika extractor
[ https://issues.apache.org/jira/browse/CONNECTORS-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707548#comment-14707548 ] Shinichiro Abe commented on CONNECTORS-1230: The DocumentFilter length check works on the byte array, i.e. it checks the byte length of files. OTOH, this check works on the character array, i.e. it checks the number of characters of the content in the files. Add writeLimit option on Tika extractor --- Key: CONNECTORS-1230 URL: https://issues.apache.org/jira/browse/CONNECTORS-1230 Project: ManifoldCF Issue Type: Improvement Components: Tika extractor Reporter: Shinichiro Abe Assignee: Shinichiro Abe Priority: Minor Attachments: CONNECTORS-1230.patch Filter out documents that exceed a specified maximum string length. Defaults to -1 (no limit), which is the current default behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
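Editor's note: the distinction drawn in the comment above (byte length of a file versus character count of its content) can be illustrated with plain JDK calls. This is only a sketch; the class LengthCheckSketch and its methods are hypothetical and are not part of DocumentFilter or the Tika extractor. For binary formats such as PDF the gap between the two numbers is usually much larger than in this plain-text case.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of byte length vs. character length.
class LengthCheckSketch {
  // What a byte-oriented check would measure: the size of the encoded data.
  static int byteLength(String text) {
    return text.getBytes(StandardCharsets.UTF_8).length;
  }

  // What a character-oriented check would measure: the number of chars
  // in the extracted content.
  static int charLength(String text) {
    return text.length();
  }

  public static void main(String[] args) {
    // Three Japanese characters, each encoded as three bytes in UTF-8.
    String text = "\u65e5\u672c\u8a9e";
    System.out.println("chars=" + charLength(text) + " bytes=" + byteLength(text));
  }
}
```

A limit of, say, 5 would therefore let this text through a character-based check but reject it in a byte-based one, which is why the two checks are not duplicates of each other.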
[jira] [Commented] (CONNECTORS-1219) Lucene Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650673#comment-14650673 ] Shinichiro Abe commented on CONNECTORS-1219: Committed r1693798 to the branch. The multiprocess mode works with HDFS indexes; I've tested the zk and file multiprocess examples. For now there is one HDFS index per processId, since an IndexWriter works per process. If I make IndexWriters write to the same index across processes, IndexWriter throws LockObtainFailedException. In this setup, removeDocument cannot work properly because the connection does not know the processId, only the documentURI. Please advise. Lucene Output Connector --- Key: CONNECTORS-1219 URL: https://issues.apache.org/jira/browse/CONNECTORS-1219 Project: ManifoldCF Issue Type: New Feature Reporter: Shinichiro Abe Assignee: Shinichiro Abe Attachments: CONNECTORS-1219-v0.1patch.patch, CONNECTORS-1219-v0.2.patch, CONNECTORS-1219-v0.3.patch An output connector that writes to a local Lucene index directly, not via a remote search engine. It would be nice if we could use the various Lucene APIs against the index directly, even though we could do the same thing with a Solr or Elasticsearch index. I assume we could do things like classification, categorization, and tagging, using e.g. the lucene-classification package. -- This message was sent by Atlassian JIRA (v6.3.4#6332)