Hey Rishi,
The filemgr connection in the pushpull exists only to check whether the filemgr already has a file, so that the pushpull doesn't redownload it (there is no ingest support)... usually you configure your pushpull daemon to run at longer intervals, while the crawler wakes up more often (every 30 seconds is a typical interval for it)... so just have the pushpull download its files to a staging area that is the same directory the crawler is monitoring.
-brian
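
A minimal sketch of the hand-off described above, written in Python rather than against OODT's own Java code. The names `already_ingested`, `pull_once`, and `crawl_once` are hypothetical stand-ins for the pushpull's filemgr check, one pushpull run, and one crawler wake-up; none of them is OODT API. The in-memory `catalog` set and the deletion of files after ingest are also assumptions made only to keep the example self-contained.

```python
import tempfile
from pathlib import Path


def already_ingested(name, catalog):
    """Stand-in for the pushpull's filemgr lookup (hypothetical, not OODT API)."""
    return name in catalog


def pull_once(remote_files, staging_dir, catalog):
    """One pushpull run: download only files the catalog does not already hold."""
    downloaded = []
    for name, payload in remote_files.items():
        target = staging_dir / name
        # Skip files that are already catalogued or already waiting in staging.
        if already_ingested(name, catalog) or target.exists():
            continue
        target.write_bytes(payload)
        downloaded.append(name)
    return downloaded


def crawl_once(staging_dir, catalog):
    """One crawler wake-up: ingest whatever is currently in the staging area."""
    ingested = []
    for path in sorted(staging_dir.iterdir()):
        if path.is_file():
            catalog.add(path.name)  # assumption: "ingest" just records the name
            path.unlink()           # assumption: staged file is removed afterwards
            ingested.append(path.name)
    return ingested


def demo():
    catalog = {"a.dat"}  # the catalog already holds a.dat
    remote = {"a.dat": b"old", "b.dat": b"new"}
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp)
        first = pull_once(remote, staging, catalog)   # only b.dat is fetched
        ingested = crawl_once(staging, catalog)       # crawler picks up b.dat
        second = pull_once(remote, staging, catalog)  # nothing left to fetch
    return first, ingested, second
```

In a real deployment the two functions would run as separate daemons on their own timers, with the crawler polling the shared directory more often than the pushpull runs.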
On Nov 09, 2012, at 11:06 AM, "Verma, Rishi (388J)" <[email protected]> wrote:
Hey Brian, Shreyl,
Thanks for your input and clarification on this.
Brian - the delegation of duties you described makes sense. Does cas-pushpull have any way to invoke a local crawl process once downloads complete? I know it has a filemgr hookup, but I wonder whether a crawl process can be invoked after all file downloads via pushpull have finished. The alternative, of course, would be to schedule the crawler daemon to run well after the pushpull daemon finishes its work.
Thanks to both of you for your help!
rishi
On Nov 9, 2012, at 10:08 AM, Brian Foster wrote:
Hey Rishi,
You will need to use both cas-pushpull and cas-crawler to accomplish this...
cas-pushpull: Used for downloading files from remote sites to your local system... the .tmp files contain cas-pushpull's known metadata, and you can configure which of the known metadata gets written out, or whether a .tmp file gets created at all... however you can add custom metadata fields to it.
cas-crawler: Allows for metadata extraction (custom metadata) from files on your local system... and then allows you to ingest them into the filemgr (ingestion is optional and can be turned off)
HTH
-brian
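
The split of duties above can be sketched roughly as follows, in plain Python and not using any OODT class. The field names (`Filename`, `RemoteSite`, `ProductType`) and the functions `known_metadata`, `custom_extractor`, and `build_metadata` are hypothetical; they only illustrate download-time metadata being combined with the output of a custom extractor that runs on the local file.

```python
def known_metadata(filename, remote_site):
    """Metadata the downloader already knows (hypothetical field names)."""
    return {"Filename": filename, "RemoteSite": remote_site}


def custom_extractor(filename):
    """Toy custom extractor: derive a product type from the file extension."""
    stem, _, ext = filename.rpartition(".")
    # No extension (or a dotfile) means there is nothing to derive a type from.
    return {"ProductType": ext.upper() if stem else "UNKNOWN"}


def build_metadata(filename, remote_site, extractor=custom_extractor):
    """Combine download-time metadata with custom, crawler-side metadata."""
    met = known_metadata(filename, remote_site)
    met.update(extractor(filename))
    return met
```

Passing a different `extractor` callable is the point of the sketch: the download step stays generic while the extraction step carries the product-specific logic.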
On Nov 08, 2012, at 06:11 PM, "Verma, Rishi (388J)" <[email protected]> wrote:
Hi All -
I'm wondering if anyone has experience with, or knows the details of, how to use custom MetExtractors on products that are remotely downloaded via PushPull.
By default, PushPull performs some basic met-extraction and creates a ".tmp" file associated with downloaded products, but I'm wondering whether this met generation step is customizable.
I've looked through the configuration files (e.g. [1], [2]) as well as the code for PushPull, but I can't seem to locate configuration parameters to support the invocation of custom met extractors on downloaded data.
If any of you have experience with this, or can point me to where to look, I'd really appreciate it.
Thanks!
Rishi
--
