Hi,


Uploading PDF and image files works perfectly in ML5.

When I try the same upload into ML6, it fails with the error "document is 
not UTF-8 encoded".



RecordLoader properties file:

CONNECTION_STRING=xcc://test:test@localhost:8010/test
INPUT_STRIP_PREFIX=^[A-Z]:
INPUT_NORMALIZE_PATHS=true
INPUT_PATH=D:/test/
OUTPUT_COLLECTIONS=wikipedia
URI_PREFIX=/
INPUT_PATTERN=.+\\.(PDF|JPG|pdf|jpg|JPEG|jpeg)$



Now I am getting this error:



Apr 23, 2013 7:08:50 PM com.marklogic.ps.SimpleLogger logException
SEVERE: com.marklogic.recordloader.LoaderException: /test/D:\test\extractedFiles\test.jpg
com.marklogic.xcc.exceptions.XQueryException: XDMP-DOCUTF8SEQ: Invalid UTF-8 escape sequence at / /test/D:\test\extractedFiles\APP_APP1102.jpg line 1 -- document is not UTF-8 encoded
[Client: XCC/6.0-2, Server: XDBC/6.0-2.2]
        at com.marklogic.xcc.impl.handlers.ServerExceptionHandler.handleResponse(ServerExceptionHandler.java:34)
        at com.marklogic.xcc.impl.handlers.ContentInsertController.serverDialog(ContentInsertController.java:139)
        at com.marklogic.xcc.impl.handlers.AbstractRequestController.runRequest(AbstractRequestController.java:84)
        at com.marklogic.xcc.impl.SessionImpl.insertContent(SessionImpl.java:309)
        at com.marklogic.xcc.impl.SessionImpl.insertContent(SessionImpl.java:274)
        at com.marklogic.xcc.impl.SessionImpl.insertContent(SessionImpl.java:338)
        at com.marklogic.recordloader.xcc.XccContent.insert(XccContent.java:74)
        at com.marklogic.recordloader.AbstractLoader.insert(AbstractLoader.java:326)
        at com.marklogic.recordloader.FileLoader.process(FileLoader.java:60)
        at com.marklogic.recordloader.AbstractLoader.call(AbstractLoader.java:96)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Apr 23, 2013 7:08:50 PM com.marklogic.recordloader.Monitor halt
INFO: halting
Apr 23, 2013 7:08:50 PM com.marklogic.recordloader.Monitor run
INFO: loaded 1 records ok (15831 B in 1.868895839 s, 1 tps, 8 kB/s), with 0 error(s)
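
The XDMP-DOCUTF8SEQ message suggests the server tried to read the JPEG as UTF-8 text instead of storing it as a binary document. The following is only a small illustration of why that fails for image data, not RecordLoader or XCC code: the class and method names are hypothetical, and it uses nothing beyond the Java standard library.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class Utf8Check {
    /** Returns true only if the bytes decode cleanly as UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder rejected a byte sequence that is not legal UTF-8.
            return false;
        }
    }

    public static void main(String[] args) {
        // JPEG files begin with the bytes FF D8 FF; 0xFF never occurs in valid UTF-8.
        byte[] jpegHeader = { (byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0 };
        byte[] xml = "<doc>ok</doc>".getBytes(StandardCharsets.UTF_8);
        System.out.println("JPEG header is valid UTF-8: " + isValidUtf8(jpegHeader));
        System.out.println("XML text is valid UTF-8: " + isValidUtf8(xml));
    }
}
```

If the files themselves are intact, the thing to look at is how the loader is told to treat them (binary versus XML or text), not their encoding.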



-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Michael Blakeley
Sent: Saturday, April 20, 2013 1:00 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Record Loader - not able to load binary 
files (pdf, images)



I asked for *full* logs: you've left out much of the interesting setup and 
configuration.



However, there is just enough here to see the problem: INPUT_PATTERN isn't set, 
so it's using the default value, which only matches *.xml filenames. From 
http://marklogic.github.io/recordloader/



Property         Default value            Notes

INPUT_PATTERN    ^.+\\.[Xx][Mm][Ll]$      Matching pattern (regex) for files found in INPUT_PATH. The default value matches all filenames ending with .xml



RecordLoader isn't using your files because it's only looking for filenames 
that match the regex '^.+\\.[Xx][Mm][Ll]$'. Try something like 
INPUT_PATTERN=.+\\.(PDF|JPG|pdf|jpg)$ instead.
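
A candidate pattern can be checked in plain Java before running a load. This is a rough sketch, not RecordLoader code: the class and method names are hypothetical, and it assumes the pattern has to match the whole filename, which may not be exactly how RecordLoader applies it. It also shows that the doubled backslash in a properties file is read as a single regex backslash.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class InputPatternDemo {
    /** Reads INPUT_PATTERN from properties-file text and compiles it as a regex. */
    static Pattern loadPattern(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Pattern.compile(props.getProperty("INPUT_PATTERN"));
    }

    /** Assumption: the whole filename must match the pattern. */
    static boolean accepts(Pattern pattern, String fileName) {
        return pattern.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        // "\\\\" in Java source is the two-character file text \\ ;
        // Properties.load unescapes it to a single backslash, giving the regex \.
        Pattern xmlOnly = loadPattern("INPUT_PATTERN=^.+\\\\.[Xx][Mm][Ll]$\n");
        Pattern binaries = loadPattern("INPUT_PATTERN=.+\\\\.(PDF|JPG|pdf|jpg)$\n");
        for (String name : new String[] { "a.xml", "test.jpg", "report.PDF", "photo.jpeg" }) {
            System.out.println(name + ": default=" + accepts(xmlOnly, name)
                    + ", suggested=" + accepts(binaries, name));
        }
    }
}
```

Note that a name such as photo.jpeg is not covered by the suggested pattern unless JPEG and jpeg are added to the alternation.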



-- Mike



On 19 Apr 2013, at 12:24 , Mohanraj Chozhan 
<[email protected]<mailto:[email protected]>> wrote:



> Recordloader logs

>

> Apr 20, 2013 12:51:06 AM

> com.marklogic.recordloader.DefaultInputHandler configureInputs

> INFO: adding D:/test/

> Apr 20, 2013 12:51:06 AM com.marklogic.recordloader.LoaderFactory

> <init>

> INFO: Loader is com.marklogic.recordloader.FileLoader

> Apr 20, 2013 12:51:06 AM

> com.marklogic.recordloader.DefaultInputHandler run

> INFO: populating queue

> Apr 20, 2013 12:51:06 AM

> com.marklogic.recordloader.DefaultInputHandler run

> INFO: queued 0 loader(s)

> Apr 20, 2013 12:51:06 AM com.marklogic.recordloader.Monitor halt

> INFO: halting

> Apr 20, 2013 12:51:06 AM com.marklogic.recordloader.Monitor run

> INFO: loaded 0 records ok (0 B in 0.5075154 s, 0 tps, 0 kB/s), with 0

> error(s)

>

>

> In the D:/test directory I have pdf and jpg files, but it shows 0 files.

>

>

>

> -----Original Message-----

> From: 
> [email protected]<mailto:[email protected]>

> [mailto:[email protected]] On Behalf Of Michael

> Blakeley

> Sent: Saturday, April 20, 2013 12:36 AM

> To: MarkLogic Developer Discussion

> Subject: Re: [MarkLogic Dev General] Record Loader - not able to load

> binary files (pdf, images)

>

> If you can provide the full RecordLoader logs we may be able to diagnose the 
> problem.

>

> One potential problem I see is that your CONNECTION_STRING doesn't seem to 
> have a username or password.

>

> -- Mike

>

> On 19 Apr 2013, at 11:36 , Mohanraj Chozhan 
> <[email protected]<mailto:[email protected]>> wrote:

>

> > Which version of MarkLogic are you using?

> > We are using ML6.

> >

> > Which version of RecordLoader?

> >

> > recordloader.jar - We can't find the version xpp3-1.1.3_8.jar

> >

> > What's the exact command you're issuing to invoke the JAR?

> >

> > We are using the below code to execute the recordloader

> >

> > // Path to the RecordLoader properties file (stray spaces removed)
> > String[] args = { "resources/recordloader.properties" };
> > try {
> >     RecordLoader.main(args);
> > } catch (Exception e) {
> >     throw new MarocMLDBException("Uploading bulk files unsuccessful", e);
> > }

> > recordloader.properties

> >

> > CONNECTION_STRING=xcc://localhost:8100/test

> > INPUT_PATH=D:/test

> > OUTPUT_COLLECTIONS=wikipedia

> > DOCMENT_TYPE=binary

> > URI_PREFIX=/FR/

> >

> > What's the error message you're getting?

> > XML files are getting loaded into the ML repository.

> >

> > Can you provide a working MLCP sample? We tried it and were unable to use it.

> >

> > That is the reason we chose RecordLoader.

> >

> > It would be more helpful if we could get an MLCP sample to use.

> >

> > Thanks in advance

> >

> > Mohanraj

> >

> > From: 
> > [email protected]<mailto:[email protected]>

> > [mailto:[email protected]] On Behalf Of Justin

> > Makeig

> > Sent: Friday, April 19, 2013 11:49 PM

> > To: MarkLogic Developer Discussion

> > Subject: Re: [MarkLogic Dev General] Record Loader - not able to

> > load binary files (pdf, images)

> >

> > Which version of MarkLogic are you using? Which version of RecordLoader? 
> > What's the exact command you're issuing to invoke the JAR? What's the error 
> > message you're getting? If you're using MarkLogic 6, you might also take a 
> > look at mlcp <http://docs.marklogic.com/guide/ingestion/content-pump>.

> >

> > Justin

> >

> > Justin Makeig

> > Director, Product Management

> > MarkLogic Corporation

> > [email protected]<mailto:[email protected]>

> > www.marklogic.com<http://www.marklogic.com>

> >

> >

> >

> >

> > On Apr 19, 2013, at 11:11 AM, Mohanraj Chozhan

> > <[email protected]<mailto:[email protected]>>

> >  wrote:

> >

> >

> > Hi,

> >

> > I am using the ML RecordLoader, but I am unable to load binary files 
> > (pdf, images) into the ML repository.

> >

> > In the record loader properties file we added

> >

> > DOCUMENT_TYPE=binary

> >

> > But we are still unable to load them.

> >

> > Can someone help me out with this?

> >

> > Regards,

> > Mohanraj


> > _______________________________________________

> > General mailing list

> > [email protected]<mailto:[email protected]>

> > http://developer.marklogic.com/mailman/listinfo/general

