Hmm... On second thought, that approach won't work, because the input documents 
are binary. The content module interface only handles XML. I believe this is 
the first time anyone has asked for binary support.

You could try patching recordloader/xcc/XccModuleContent.java to support this. 
I don't think XCC can handle setting binary nodes as external variables for a 
request object. But what should work is to convert the binary node into a 
Base64-encoded string. Then the module would have to convert it back to binary, 
of course.
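
To make the Base64 idea concrete, here is a minimal sketch of the Java side of the round trip. It uses only the JDK's java.util.Base64 (Java 8 or later) and does not touch XCC or RecordLoader. The class and method names are hypothetical, and the decode step is shown in Java only to prove the round trip is lossless: in the setup described above, decoding would happen inside the content module instead.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64RoundTrip {
    // Encode raw document bytes as a Base64 string, which could then be
    // passed to the module as an ordinary string value (assumption).
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Reverse the encoding. Shown here only to verify the round trip.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // "%PDF" followed by a few bytes that are not valid text
        byte[] raw = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, 0x00, (byte) 0x80};
        String encoded = encode(raw);
        byte[] back = decode(encoded);
        System.out.println("encoded: " + encoded);
        System.out.println("round trip ok: " + Arrays.equals(raw, back));
    }
}
```

Keep in mind that Base64 inflates the payload by roughly a third, so large binaries will cost more to send this way.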

It might be a good idea to extend the content module interface with a new 
variable too, so that the module knows what the input document-type is. 
Strictly speaking that isn't necessary because the document-type is fixed for a 
single invocation of RecordLoader, but it still seems like the right thing to 
do.

-- Mike

On 19 Apr 2013, at 13:46 , Michael Blakeley <[email protected]> wrote:

> It sounds like you are looking for this:
> 
> CONTENT_FACTORY_CLASSNAME=com.marklogic.recordloader.xcc.XccModuleContentFactory
> CONTENT_MODULE_URI=my-code-module.xqy
> 
> Note that the content module has to conform to a strict interface. See 
> http://marklogic.github.io/recordloader/ for details and sample code.
> 
> -- Mike
> 
> On 19 Apr 2013, at 12:59 , Mohanraj Chozhan <[email protected]> 
> wrote:
> 
>> Thank you very much, it worked brilliantly.
>> 
>> I also need to know how to inject xqy files, so that I can write my own 
>> business logic for moving the documents into an ML directory from RecordLoader.
>> 
>> 
>> -----Original Message-----
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of Michael 
>> Blakeley
>> Sent: Saturday, April 20, 2013 1:00 AM
>> To: MarkLogic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] Record Loader - not able to load binary 
>> files (pdf, images)
>> 
>> I asked for *full* logs: you've left out much of the interesting setup and 
>> configuration.
>> 
>> However there is just enough here to see the problem: INPUT_PATTERN isn't 
>> set, so it's using the default value, which only matches *.xml filenames. 
>> From http://marklogic.github.io/recordloader/
>> 
>> Property:      INPUT_PATTERN
>> Default value: ^.+\\.[Xx][Mm][Ll]$
>> Notes:         Matching pattern (regex) for files found in INPUT_PATH. The 
>>                default value matches all filenames ending with .xml
>> 
>> RecordLoader isn't using your files because it's only looking for filenames 
>> that match the regex '^.+\\.[Xx][Mm][Ll]$'. Try something like 
>> INPUT_PATTERN=.+\\.(PDF|JPG|pdf|jpg)$ instead.
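
The difference between the two patterns can be checked with a short, standalone Java sketch. It assumes RecordLoader applies the pattern as a full-filename match (Matcher.matches()), which I have not verified against its source. The class name, method name, and sample filenames are made up for illustration. Note that the doubled backslash in the properties file and in a Java string literal both produce a single backslash in the actual regex.

```java
import java.util.regex.Pattern;

public class InputPatternCheck {
    // Default INPUT_PATTERN as quoted from the RecordLoader documentation.
    static final Pattern DEFAULT_PATTERN = Pattern.compile("^.+\\.[Xx][Mm][Ll]$");
    // Suggested replacement for PDF and JPG inputs.
    static final Pattern BINARY_PATTERN = Pattern.compile(".+\\.(PDF|JPG|pdf|jpg)$");

    // Assumption: the whole filename must match the pattern.
    static boolean accepts(Pattern pattern, String filename) {
        return pattern.matcher(filename).matches();
    }

    public static void main(String[] args) {
        String[] names = {"doc.xml", "report.pdf", "photo.JPG", "photo.Jpg"};
        for (String name : names) {
            System.out.println(name
                + " default=" + accepts(DEFAULT_PATTERN, name)
                + " binary=" + accepts(BINARY_PATTERN, name));
        }
    }
}
```

The suggested pattern lists only all-uppercase and all-lowercase extensions, so a mixed-case name such as photo.Jpg would still be skipped. A character-class form like [Pp][Dd][Ff] would cover those if needed.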
>> 
>> -- Mike
>> 
>> On 19 Apr 2013, at 12:24 , Mohanraj Chozhan <[email protected]> 
>> wrote:
>> 
>>> Recordloader logs
>>> 
>>> Apr 20, 2013 12:51:06 AM 
>>> com.marklogic.recordloader.DefaultInputHandler configureInputs
>>> INFO: adding D:/test/
>>> Apr 20, 2013 12:51:06 AM com.marklogic.recordloader.LoaderFactory 
>>> <init>
>>> INFO: Loader is com.marklogic.recordloader.FileLoader
>>> Apr 20, 2013 12:51:06 AM 
>>> com.marklogic.recordloader.DefaultInputHandler run
>>> INFO: populating queue
>>> Apr 20, 2013 12:51:06 AM 
>>> com.marklogic.recordloader.DefaultInputHandler run
>>> INFO: queued 0 loader(s)
>>> Apr 20, 2013 12:51:06 AM com.marklogic.recordloader.Monitor halt
>>> INFO: halting
>>> Apr 20, 2013 12:51:06 AM com.marklogic.recordloader.Monitor run
>>> INFO: loaded 0 records ok (0 B in 0.5075154 s, 0 tps, 0 kB/s), with 0 
>>> error(s)
>>> 
>>> 
>>> The D:/test directory contains pdf and jpg files, but the log shows 0 
>>> files loaded.
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: [email protected] 
>>> [mailto:[email protected]] On Behalf Of Michael 
>>> Blakeley
>>> Sent: Saturday, April 20, 2013 12:36 AM
>>> To: MarkLogic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] Record Loader - not able to load 
>>> binary files (pdf, images)
>>> 
>>> If you can provide the full RecordLoader logs we may be able to diagnose 
>>> the problem.
>>> 
>>> One potential problem I see is that your CONNECTION_STRING doesn't seem to 
>>> have a username or password.
>>> 
>>> -- Mike
>>> 
>>> On 19 Apr 2013, at 11:36 , Mohanraj Chozhan <[email protected]> 
>>> wrote:
>>> 
>>>> Which version of MarkLogic are you using?
>>>> We are using ML6.
>>>> 
>>>> Which version of RecordLoader?
>>>> 
>>>> recordloader.jar - We can't find the version
>>>> xpp3-1.1.3_8.jar
>>>> 
>>>> What's the exact command you're issuing to invoke the JAR?
>>>> 
>>>> We are using the below code to execute the recordloader
>>>> 
>>>> String[] args = { "resources/recordloader.properties" };
>>>> try {
>>>>     // RecordLoader reads its settings from the properties file
>>>>     RecordLoader.main(args);
>>>> } catch (Exception e) {
>>>>     throw new MarocMLDBException("Uploading bulk files unsuccessful", e);
>>>> }
>>>> recordloader.properties
>>>> 
>>>> CONNECTION_STRING=xcc://localhost:8100/test
>>>> INPUT_PATH=D:/test
>>>> OUTPUT_COLLECTIONS=wikipedia
>>>> DOCMENT_TYPE=binary
>>>> URI_PREFIX=/FR/
>>>> 
>>>> What's the error message you're getting?
>>>> XML files are getting loaded into the ML repository.
>>>> 
>>>> Can you provide a working MLCP sample? We tried it and were unable to use it.
>>>> 
>>>> That's the reason we chose RecordLoader.
>>>> 
>>>> It would be more helpful if we could get an MLCP sample to use.
>>>> 
>>>> Thanks in advance
>>>> 
>>>> Mohanraj
>>>> 
>>>> From: [email protected]
>>>> [mailto:[email protected]] On Behalf Of Justin 
>>>> Makeig
>>>> Sent: Friday, April 19, 2013 11:49 PM
>>>> To: MarkLogic Developer Discussion
>>>> Subject: Re: [MarkLogic Dev General] Record Loader - not able to 
>>>> load binary files (pdf, images)
>>>> 
>>>> Which version of MarkLogic are you using? Which version of RecordLoader? 
>>>> What's the exact command you're issuing to invoke the JAR? What's the 
>>>> error message you're getting? If you're using MarkLogic 6, you might also 
>>>> take a look at mlcp 
>>>> <http://docs.marklogic.com/guide/ingestion/content-pump>.
>>>> 
>>>> Justin
>>>> 
>>>> Justin Makeig
>>>> Director, Product Management
>>>> MarkLogic Corporation
>>>> [email protected]
>>>> www.marklogic.com
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Apr 19, 2013, at 11:11 AM, Mohanraj Chozhan 
>>>> <[email protected]>
>>>> wrote:
>>>> 
>>>> 
>>>> Hi,
>>>> 
>>>> I am using the ML RecordLoader, but I am unable to load binary files (pdf, 
>>>> images) into the ML repository.
>>>> 
>>>> In the record loader properties file we added
>>>> 
>>>> DOCUMENT_TYPE=binary
>>>> 
>>>> But we are still unable to load them.
>>>> 
>>>> Can someone help me out with this?
>>>> 
>>>> Regards,
>>>> Mohanraj
>>>> _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
>>> 
>> 
>> 
> 
