If you've updated the DBInterfaceMySQL driver, any chance you would be willing to contribute it back to the project?
Karl

On Sun, Dec 4, 2011 at 11:13 PM, Hitoshi Ozawa <[email protected]> wrote:
> "The interpretation of this field will differ from connector to connector".
> From the above description, it seems the content of entityid depends on
> which connector is being used to crawl the web pages.
> You're right about the second point on the entityid column datatype. In MySQL,
> which I'm using with ManifoldCF, the datatype of entityid is LONGTEXT. I was
> just using it figuratively, even though I just found out that I can actually
> execute the sql statement. :-)
>
> Cheers,
> H.Ozawa
>
> (2011/12/05 10:29), Karl Wright wrote:
>>
>> Well, the history comes from the repohistory table, yes - but you may
>> not be able to construct a query with entityid=jobs.id, first of all
>> because that is incorrect (what the entity field contains is dependent
>> on the activity type), and secondly because that column is
>> potentially long and only some kinds of queries can be done against
>> it. Specifically, it cannot be built into an index on PostgreSQL.
>>
>> Karl
>>
>> On Sun, Dec 4, 2011 at 7:50 PM, Hitoshi Ozawa
>> <[email protected]> wrote:
>>>
>>> Is "history" just entries in the "repohistory" table with entityid =
>>> jobs.id?
>>>
>>> H.Ozawa
>>>
>>> (2011/12/03 1:43), Karl Wright wrote:
>>>>
>>>> The best place to get this from is the simple history. A command-line
>>>> utility to dump this information to a text file should be possible
>>>> with the currently available interface primitives. If that is how you
>>>> want to go, you will need to run ManifoldCF in multiprocess mode.
>>>> Alternatively, you might want to request the info from the API, but
>>>> that's problematic because nobody has implemented report support in
>>>> the API as of now.
>>>>
>>>> A final alternative is to get this from the log. There is an [INFO]
>>>> level line from the web connector for every fetch, I seem to recall,
>>>> and you might be able to use that.
>>>>
>>>> Thanks,
>>>> Karl
>>>>
>>>> On Fri, Dec 2, 2011 at 11:18 AM, M Kelleher <[email protected]>
>>>> wrote:
>>>>>
>>>>> Is it possible to export / download the list of URLs visited during a
>>>>> crawl job?
>>>>>
>>>>> Sent from my iPad
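
[Editor's note: the log-scraping approach Karl suggests above could be sketched roughly as follows. This is only an illustration, not part of ManifoldCF: the exact wording of the web connector's [INFO] fetch line is an assumption here, so the substring check and regex would need to be adapted to what your log actually contains.]

```python
import re
import sys

# Assumption: fetch lines are INFO-level and contain the fetched URL
# somewhere on the line. Adjust these patterns to your actual log format.
URL_RE = re.compile(r"https?://\S+")

def extract_fetched_urls(lines):
    """Yield URLs found on lines that look like INFO-level fetch entries."""
    for line in lines:
        if "INFO" in line and "fetch" in line.lower():
            match = URL_RE.search(line)
            if match:
                yield match.group(0)

if __name__ == "__main__":
    # Usage: python extract_urls.py manifoldcf.log > urls.txt
    with open(sys.argv[1]) as log_file:
        for url in extract_fetched_urls(log_file):
            print(url)
```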
