Hi Karl, sorry for my English :). I mean that I have to extract values from a query that joins two tables in a one-to-many relationship, but the dataset returned by the connector contains only one pair from the two tables.
For example:
Table A with persons
Table B with eyes

As the result of the join, I expect to have two rows like:
person 1, eye left
person 1, eye right

but the connector returns only one row:
person 1, eye left

I hope it's clearer now.

PS: I quote the sentence from the Manifold documentation that explains this ( https://manifoldcf.apache.org/release/release-2.3/en_US/end-user-documentation.html#jdbcrepository ):
------
There is currently no support in the JDBC connection type for natively handling multi-valued metadata.
------

Thanks,
L. Alicata

2016-05-06 15:10 GMT+02:00 Karl Wright <[email protected]>:

> Hi Luca,
>
> It is not clear what you mean by "multi value extraction" using the JDBC
> connector. The JDBC connector allows collection of primary binary content
> as well as metadata from a database row. So maybe if you can explain what
> you need beyond that it would help.
>
> Thanks,
> Karl
>
>
> On Fri, May 6, 2016 at 9:04 AM, Luca Alicata <[email protected]>
> wrote:
>
>> Hi Karl,
>> thanks for the information. Fortunately, in another JBoss instance I
>> have an old single-process Manifold configuration that I had retired.
>> I am now starting to test these jobs with it, and if it works fine, I
>> can use it just for these jobs, in production as well. Later, if I
>> can, I will try to investigate the problem that stops the agent.
>>
>> I'll take advantage of this discussion to ask whether multi-value
>> extraction from a DB is being considered as possible future work. I
>> ask because I have used the Generic Connector to work around this gap
>> in the JDBC Connector. With Manifold 1.8 I had modified the connector
>> to support this behavior (in addition to parsing blob files), but when
>> upgrading the Manifold version, to avoid rewriting the connector, I
>> decided to use the Generic Connector with an application that does the
>> work of extracting data from the DB.
>>
>> Thanks,
>> L. Alicata
>>
>> 2016-05-06 14:42 GMT+02:00 Karl Wright <[email protected]>:
>>
>>> Hi Luca,
>>>
>>> If you do a lock clean and the process still stops, then the locks are
>>> not the problem.
>>>
>>> One way we can drill down into the problem is to get a thread dump of
>>> the agents process after it stops. The thread dump must be of the agents
>>> process, not any of the others.
>>>
>>> FWIW, the generic connector is not well supported; the person who wrote
>>> it is still a committer but is not actively involved in MCF development at
>>> this time. I suspect that the problem may have to do with how that
>>> connector deals with exceptions or errors, but I am not sure.
>>>
>>> Thanks,
>>>
>>> Karl
>>>
>>>
>>> On Fri, May 6, 2016 at 8:38 AM, Luca Alicata <[email protected]>
>>> wrote:
>>>
>>>> Hi Karl,
>>>> I've just tried lock-clean after the agents stopped working (after
>>>> stopping the process, of course). After this, the job starts
>>>> correctly, but the second time I start a job with a lot of data (or
>>>> sometimes the third time), the agent stops again.
>>>>
>>>> Unfortunately, for the moment it is difficult to start using
>>>> Zookeeper in this environment. Would it fix the agents stopping
>>>> while they are working, or does it only help with cleaning up agent
>>>> locks when I restart the process?
>>>>
>>>> Thanks,
>>>> L. Alicata
>>>>
>>>> 2016-05-06 14:15 GMT+02:00 Karl Wright <[email protected]>:
>>>>
>>>>> Hi Luca,
>>>>>
>>>>> With file-based synchronization, if you kill any of the processes
>>>>> involved, you will need to execute the lock-clean procedure to make sure
>>>>> you have no dangling locks in the file system.
>>>>>
>>>>> - shut down all MCF processes (except the database)
>>>>> - run the lock-clean script
>>>>> - start your MCF processes back up
>>>>>
>>>>> I suspect what you are seeing is related to this.
>>>>>
>>>>> Also, please consider using Zookeeper instead, since it is more robust
>>>>> about cleaning out dangling locks.
>>>>>
>>>>> Thanks,
>>>>> Karl
>>>>>
>>>>>
>>>>> On Fri, May 6, 2016 at 8:06 AM, Luca Alicata <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi Karl,
>>>>>> thanks for the help.
>>>>>> In my case I have only one instance of MCF running, with both types
>>>>>> of job (SP and Generic), so I have only one properties file (which
>>>>>> I have attached).
>>>>>> For information, I use the multiprocess-file configuration with
>>>>>> Postgres.
>>>>>>
>>>>>> Do you have other suggestions? Do you need any more information
>>>>>> that I can give you?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> L. Alicata
>>>>>>
>>>>>> 2016-05-06 12:55 GMT+02:00 Karl Wright <[email protected]>:
>>>>>>
>>>>>>> Hi Luca,
>>>>>>>
>>>>>>> Do you have multiple independent MCF clusters running at the same
>>>>>>> time? It sounds like you do: you have SP on one, and Generic on
>>>>>>> another. If so, you will need to be sure that the synchronization
>>>>>>> you are using (either zookeeper or file-based) does not overlap.
>>>>>>> Each cluster needs its own synchronization. If there is overlap,
>>>>>>> then doing things with one cluster may cause the other cluster to
>>>>>>> hang. This also means you have to have different properties files
>>>>>>> for the two clusters, of course.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Karl
>>>>>>>
>>>>>>>
>>>>>>> On Fri, May 6, 2016 at 4:32 AM, Luca Alicata <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I'm using Manifold 2.2 with the multi-process configuration in a
>>>>>>>> JBoss instance on Windows Server 2012, and I have a set of jobs
>>>>>>>> that work with SharePoint (SP) or the Generic Connector (GC),
>>>>>>>> which gets files from a DB.
>>>>>>>> With SP I have no problems, while with GC jobs with a lot of
>>>>>>>> documents (one with 47k and another with 60k), the seeding
>>>>>>>> process sometimes does not finish, because the agents seem to
>>>>>>>> stop (although the Java process is still alive).
>>>>>>>> After this, if I try to start any other job, it does not start,
>>>>>>>> as if the agents were stopped.
>>>>>>>>
>>>>>>>> Other times these jobs work correctly, and once they even worked
>>>>>>>> correctly together, running at the same time.
>>>>>>>>
>>>>>>>> For information:
>>>>>>>>
>>>>>>>> - On JBoss there are only the Manifold and Generic Repository
>>>>>>>> applications.
>>>>>>>>
>>>>>>>> - On the same virtual server there is another JBoss instance,
>>>>>>>> with a Solr instance and a web application.
>>>>>>>>
>>>>>>>> - I've checked whether it was some kind of memory problem, but
>>>>>>>> that is not the case.
>>>>>>>>
>>>>>>>> - GC with almost 23k seeds always works, at least in the tests
>>>>>>>> I've done.
>>>>>>>>
>>>>>>>> - In a local JBoss instance with Manifold and the Generic
>>>>>>>> Repository application, I have not seen this problem.
>>>>>>>>
>>>>>>>> This is the only recurring information that I've seen in
>>>>>>>> manifold.log:
>>>>>>>> ---------------
>>>>>>>> Connection 0.0.0.0:62755<-><ip-address>:<port> shut down
>>>>>>>> Releasing connection
>>>>>>>> org.apache.http.impl.conn.ManagedClientConnectionImpl@6c98c1bd
>>>>>>>> ---------------
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> L. Alicata
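
To make the one-to-many case at the top of this message concrete, here is a minimal Python sketch. It is only an illustration, not ManifoldCF or JDBC connector code: the `person` and `eye` tables, the in-memory SQLite database, and the function names `build_demo_db` and `fetch_multivalued` are all assumptions made for the example. It shows how the rows produced by the join for one person could be grouped into a single record with a multi-valued field.

```python
import sqlite3
from collections import defaultdict


def build_demo_db():
    """Create a hypothetical in-memory database with a one-to-many relation."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE eye (
            id INTEGER PRIMARY KEY,
            person_id INTEGER REFERENCES person(id),
            side TEXT
        );
        INSERT INTO person VALUES (1, 'person 1');
        INSERT INTO eye VALUES (1, 1, 'eye left');
        INSERT INTO eye VALUES (2, 1, 'eye right');
        """
    )
    return conn


def fetch_multivalued(conn):
    """Run the join and collect every 'eye' value under its person."""
    rows = conn.execute(
        "SELECT p.name, e.side "
        "FROM person p JOIN eye e ON e.person_id = p.id "
        "ORDER BY p.id, e.id"
    )
    grouped = defaultdict(list)
    for name, side in rows:
        # One join row per (person, eye) pair; keep all of them.
        grouped[name].append(side)
    return dict(grouped)


if __name__ == "__main__":
    print(fetch_multivalued(build_demo_db()))
```

The grouping step is the part I would like the connector to handle: every value from the "many" side is kept instead of only the first row.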
