Ok, on trunk I added a bit of extra info to the simple history logging that the null output connector does for document ingestion. This extra info summarizes attributes and their counts. I then created a job with an attribute and a fixed value, and left all other defaults in place. The output looks like this:

>>>>>>
02-22-2017 17:49:18.851 document ingest (null) file:/C:/wip/mcf/trunk/README.txt OK 4970 1 "myAttribute":1,"uri":1
02-22-2017 17:49:18.850 document ingest (null) file:/C:/wip/mcf/trunk/Livelink.patch OK 2201 1 "myAttribute":1,"uri":1
02-22-2017 17:49:18.840 document ingest (null) file:/C:/wip/mcf/trunk/lib/c3p0-0.9.1.1.jar OK 608376 1 "myAttribute":1,"uri":1
02-22-2017 17:49:18.830 document ingest (null) file:/C:/wip/mcf/trunk/build.xml.svnpatch.rej OK 515 1 "myAttribute":1,"uri":1
<<<<<<

It clearly picks up the attribute I injected, and indeed another one that comes from the file system repository connector too.

Then I unchecked both checkboxes in the metadata adjuster stage, and ran some more documents. This is the result:

>>>>>>
02-22-2017 17:54:31.020 document ingest (null) file:/C:/wip/mcf/trunk/dist/connector-common-lib/jna-4.1.0.jar OK 914597 1 "myAttribute":1
02-22-2017 17:54:30.985 document ingest (null) file:/C:/wip/mcf/trunk/lib/google-http-client-jackson2-1.19.0... .jar OK 6720 1 "myAttribute":1
<<<<<<

As you can see, it continued to inject the attribute, but now it no longer passes through the upstream attribute. This is working as designed.

So it seems clear that the issue must be related to either the Solr output connector, or to the Solr configuration. If you are using MCF 2.4, that does *not* have the SolrJ 6.x version you will need to work with Solr 6.x. That may well be where the trouble lies. Please upgrade to MCF 2.6 to rule out that possibility. If that does not fix the issue, then I will bring one of our resident Solr experts into the conversation.

Thanks,
Karl

On Wed, Feb 22, 2017 at 11:51 AM, Karl Wright <[email protected]> wrote:

> Ah, sorry once again. It is definitely the update/extract handler in the
> log entry you sent.
>
> I am quite busy at the moment and will review this evening further.
>
> Thanks,
> Karl
>
>
> On Wed, Feb 22, 2017 at 11:21 AM, Karl Wright <[email protected]> wrote:
>
>> Hi Marisol,
>>
>> The [INFO] log statement you sent earlier was not an /update/extract
>> request, and your Solr connection is set up to send to the Solr Cell
>> /update/extract endpoint. Can you look again in your logs and find the
>> *right* [INFO] statement? Thanks!!
>>
>> Karl
>>
>>
>> On Wed, Feb 22, 2017 at 10:52 AM, Marisol Redondo <
>> [email protected]> wrote:
>>
>>> I have formatted it so you have all the information
>>>
>>> Name: Sites solr dev Description: sites core in solr dev
>>> ________________________________________
>>> Connection type: Solr Max connections: 10
>>> ________________________________________
>>> Parameters:
>>> User ID=
>>> ZooKeeper znode path=
>>> Socket timeout=900
>>> Server remove handler=/update
>>> Included mime types=
>>> Use extract update handler=true
>>> Solr created date field name=
>>> ZooKeeper client timeout=60
>>> Solr modified date field name=
>>> Solr core name=sites
>>> Server protocol=http
>>> Realm=
>>> Server name=solrdev
>>> Server status handler=/admin/ping
>>> Password=********
>>> Excluded mime types=
>>> Commits=true
>>> Maximum document length=
>>> Server port=8983
>>> Connection timeout=60
>>> Solr type=standard
>>> Solr filename field name=
>>> Commit within=
>>> Solr id field name=id
>>> Solr mime type field name=
>>> ZooKeeper connect timeout=60
>>> Collection=collection1
>>> Server update handler=/update/extract
>>> Server web application=solr
>>> Solr original size field name=
>>> Solr indexed date field name=
>>> Solr content field name=
>>> ZooKeeper hosts: Host Port:
>>> localhost 2181
>>>
>>> Arguments: Name Value
>>> No arguments
>>>
>>> ________________________________________
>>> Connection status: Connection working
>>>
>>>
>>>
>>>
>>
>
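
P.S. If it helps when comparing runs, the attribute summary at the end of each history line above can be checked mechanically. This is only a minimal sketch in Python, assuming the summary is always a comma-separated list of "name":count pairs as in the output shown; the helper name parse_attribute_summary is made up for illustration and is not part of MCF.

```python
import re

# Hypothetical helper, not part of ManifoldCF. It parses the attribute
# summary that ends each history line, e.g. '"myAttribute":1,"uri":1'.
PAIR = re.compile(r'"([^"]+)":(\d+)')


def parse_attribute_summary(summary):
    """Return a dict mapping attribute name to count for one summary string."""
    return {name: int(count) for name, count in PAIR.findall(summary)}


# Summaries copied from the two runs above.
before = parse_attribute_summary('"myAttribute":1,"uri":1')
after = parse_attribute_summary('"myAttribute":1')

# Attributes that were passed through in the first run but not the second.
dropped = sorted(set(before) - set(after))
print(dropped)
```

The same comparison against what actually arrives in Solr would show whether the attribute is lost before or after the output connector.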
