Dave,

I've done a little more investigating on the box where I'm trying to 
get ImageMagick tied in. magick:wand() does not work in Query Console or CQ/DQ, 
and I did not see the plugin loaded anywhere in the log. Can you tell me which 
versions of MarkLogic and the OS you are running? I've got Server 5.0-4 x86_64 
on Red Hat Enterprise Linux Server 6.3. I also have the Express key on it, in 
case that is why the plugin is not being loaded.

Thanks,
Tim


-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of 
[email protected]
Sent: Wednesday, April 03, 2013 12:22 PM
To: [email protected]
Subject: General Digest, Vol 106, Issue 5

Send General mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://developer.marklogic.com/mailman/listinfo/general
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific than "Re: 
Contents of General digest..."


Today's Topics:

   1. Re: ImageMagick Plugin (Dave Cassel)


----------------------------------------------------------------------

Message: 1
Date: Wed, 3 Apr 2013 16:06:04 +0000
From: Dave Cassel <[email protected]>
Subject: Re: [MarkLogic Dev General] ImageMagick Plugin
To: MarkLogic Developer Discussion <[email protected]>
Message-ID:
        <6a86cfbdec4a0340b4e0138c5e90039707c...@exchg10-be02.marklogic.com>
Content-Type: text/plain; charset="us-ascii"

Tim, you actually won't need to declare the magick namespace (no pun intended). 
On the server where I have this set up, I can run:

magick:wand()

in QC.

David Cassel
Vanguard Technical Manager
MarkLogic<http://www.marklogic.com>


From: Timothy Pearce <[email protected]>
Reply-To: MarkLogic Developer Discussion <[email protected]>
Date: Monday, April 1, 2013 4:19 PM
To: "[email protected]" <[email protected]>
Subject: Re: [MarkLogic Dev General] ImageMagick Plugin

Thank you Dave for your reply,

The Info line that you outlined does not show up in my startup log.

ImageMagick is installed:

Version: ImageMagick 6.8.4-2 2013-03-27 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2013 ImageMagick Studio LLC
Features: DPC OpenMP Modules
Delegates: bzlib djvu fftw fontconfig freetype gslib jng jp2 jpeg lcms lzma 
openexr png ps rsvg tiff wmf x xml zlib

Here is the startup log, which shows that this is Red Hat Enterprise Linux 
Server 6.3:

2013-04-01 14:49:10.774 Notice: Starting MarkLogic Server 5.0-4 x86_64 in 
/opt/MarkLogic with data in /var/opt/MarkLogic
2013-04-01 14:49:10.919 Info: Host rhel.mobile1 running Linux 
2.6.32-279.11.1.el6.x86_64 (Red Hat Enterprise Linux Server release 6.3 
(Santiago))
2013-04-01 14:49:17.018 Info: Mounted forest App-Services locally on 
/var/opt/MarkLogic/Forests/App-Services
2013-04-01 14:49:17.083 Info: Mounted forest Security locally on 
/var/opt/MarkLogic/Forests/Security
2013-04-01 14:49:17.272 Info: Mounted forest DEMO-Modules locally on 
/mldata/Marklogic/DEV/Forests/DEMO-Modules
2013-04-01 14:49:17.283 Info: Mounted forest Triggers locally on 
/var/opt/MarkLogic/Forests/Triggers
2013-04-01 14:49:17.439 Info: Mounted forest DEMO-Forest1 locally on 
/mldata/Marklogic/DEV/Forests/DEMO-Forest1
2013-04-01 14:49:17.708 Info: Mounted forest Fab locally on 
/var/opt/MarkLogic/Forests/Fab
2013-04-01 14:49:17.746 Info: Mounted forest Documents locally on 
/var/opt/MarkLogic/Forests/Documents
2013-04-01 14:49:17.776 Info: Mounted forest Modules locally on 
/var/opt/MarkLogic/Forests/Modules
2013-04-01 14:49:17.810 Info: Mounted forest Schemas locally on 
/var/opt/MarkLogic/Forests/Schemas
2013-04-01 14:49:17.869 Info: Mounted forest Last-Login locally on 
/var/opt/MarkLogic/Forests/Last-Login
2013-04-01 14:49:23.077 Info: Linux Huge Pages: detected 0, recommend 320 to 
1025
2013-04-01 14:49:25.524 Notice: TaskServer: App-Services: CPF processing 
restart on host rhel.mobile1
2013-04-01 14:49:25.524 Notice: TaskServer: Fab: CPF processing restart on host 
rhel.mobile1
2013-04-01 14:49:26.698 Notice: TaskServer: Fab: CPF processing restart done on 
host rhel.mobile1
2013-04-01 14:49:26.726 Notice: TaskServer: App-Services: CPF processing 
restart done on host rhel.mobile1

MagickBuiltins.so is in place:

$ ls /opt/MarkLogic/lib/
BasisEntityExtractor.so  libjemalloc.so.1  list.txt  MagickBuiltins.so

Are there any other pieces of configuration I'm missing that would keep it from 
loading automatically? The one thing I noticed was the missing namespace 
declaration line in your blog post. Could that shed some light on what I'm missing?

Thanks,
Tim Pearce


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of 
[email protected]
Sent: Thursday, March 28, 2013 8:38 PM
To: [email protected]
Subject: General Digest, Vol 105, Issue 66


Today's Topics:

   1. Re: [MarkLogic Dev General] Reprocessing non-UTF8 ingested
      records/elements (Treskon, Matthew)


----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Mar 2013 00:25:44 +0000
From: "Treskon, Matthew" <[email protected]>
Subject: Re: [MarkLogic Dev General] Reprocessing non-UTF8 ingested
        records/elements
To: MarkLogic Developer Discussion <[email protected]>
Message-ID:
        <b0204ce234cff341a2e881b599fb284804116...@001fsn2mpn1-024.001f.mgd2.msft.net>

Content-Type: text/plain; charset="utf-8"

Thanks David. I'll revisit the in-house process. If the error was introduced 
during whatever processing the provider does, then your sketch will be helpful.


--Matthew



From: [email protected] [mailto:[email protected]] On Behalf Of David Lee
Sent: Thursday, March 28, 2013 5:54 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Reprocessing non-UTF8 ingested 
records/elements

I have a hard time imagining that the data started out as ASCII escapes ... 
something in the transition must have messed this up.
So I suggest looking at the *process* by which your "source database" ended up 
in MarkLogic. It just doesn't make sense.

But ... if there is NO OTHER WAY ...
you have to parse this as text and create new text.

The fn:tokenize function might be a place to start

https://docs.marklogic.com/fn:tokenize

This allows you to split up text into a sequence of strings like

let $strs := fn:tokenize( $element , "\\x" )

Now you will have a sequence like ("FA" , "AC" , "B5" ), plus an empty first 
item when the text starts with an escape.

You can iterate through those and parse them as hex 
https://docs.marklogic.com/xdmp:hex-to-integer

Now you end up with binary values in a sequence ... but you STILL have to turn 
them from UTF-8 (if that is what it is) into Unicode.
That will require knowing how UTF-8 (or whatever encoding you are dealing with) 
works ...
That's a pain. You don't want to go there ...
but it's possible.
http://en.wikipedia.org/wiki/UTF8

Once you create a sequence of unicode codepoints you can convert back to a 
string using http://docs.marklogic.com/fn:codepoints-to-string
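
As a rough illustration of the steps above (find each literal "\x" escape, parse 
the hex pair, then decode the resulting bytes), here is a minimal sketch in 
Python rather than XQuery. The function name decode_literal_escapes is made up 
for this example, and it assumes you already know the real source encoding, for 
instance "cp1251" (Python's name for cp-1251). It relies on Python's built-in 
codecs for the byte-to-Unicode step instead of decoding by hand, so it is not a 
drop-in answer for code running inside MarkLogic.

```python
import re

# Matches one literal escape such as the four characters \ x C F.
ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def decode_literal_escapes(text: str, encoding: str) -> str:
    """Turn literal \\xNN escapes in text into characters of the given encoding.

    Hypothetical helper: 'encoding' must be the real source encoding,
    which this function does not try to guess.
    """
    pieces = []            # decoded output, in order
    pending = bytearray()  # bytes collected from consecutive escapes
    pos = 0
    for match in ESCAPE.finditer(text):
        if match.start() > pos:
            # Plain text between escapes: flush collected bytes first.
            pieces.append(pending.decode(encoding))
            pending.clear()
            pieces.append(text[pos:match.start()])
        pending.append(int(match.group(1), 16))  # hex pair -> byte value
        pos = match.end()
    pieces.append(pending.decode(encoding))  # flush whatever is left
    pieces.append(text[pos:])                # trailing plain text, if any
    return "".join(pieces)
```

For example, decode_literal_escapes(r"\xCF\xEE\xE2\xFB\xF8\xE5\xED\xE8\xE5", "cp1251") 
returns "Повышение". Consecutive escapes are collected before decoding so that a 
multi-byte encoding such as UTF-8 is not split in the middle of a character.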

Maybe there is a better way ...

But I would encourage you to look back at your process of HOW the data ended up 
in MarkLogic like this.
It would be vastly easier to fix that than to clean up afterwards: "A fence on 
the hill or an ambulance down in the valley"

http://www.wealthandwant.com/docs/Malins_ambulance.html


-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 812-482-5224
Cell:  +1 812-630-7622
www.marklogic.com

From: [email protected] [mailto:[email protected]] On Behalf Of Treskon, Matthew
Sent: Thursday, March 28, 2013 5:44 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Reprocessing non-UTF8 ingested 
records/elements

Short answer: I got the data that way from a source database. I'll try talking 
with the provider but that may not be successful.

Plan B: your sketch would be much appreciated.


Thanks,
Matthew



From: [email protected] [mailto:[email protected]] On Behalf Of David Lee
Sent: Thursday, March 28, 2013 5:38 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Reprocessing non-UTF8 ingested 
records/elements

First question ... to punt on your question.
How did these escape chars end up in your data instead of the real data?
The BEST solution is to get the data in correctly in the first place.
Now that the data is in this form ... it's going to be painful.
Possible, but painful.

You will have to do text parsing on the data using whatever you know about 
the encoding to get it into something real, then create a new document.

There are no built-in methods to parse this kind of data ... it CAN be done but 
it will take work ...
If you really need it done I can help sketch out a plan, but the better 
solution is "don't do that" ...
Can you find out how this bogus data got in there in the first place and fix it 
from there?


From: [email protected] [mailto:[email protected]] On Behalf Of Treskon, Matthew
Sent: Thursday, March 28, 2013 5:31 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Reprocessing non-UTF8 ingested 
records/elements

David,


1)      I'm using the xml:lang attribute to "guess" the encoding: if xml:lang="Tr" 
then cp-1256.

2)      I stripped the xmlns. Yes, the characters are just ASCII representations 
of hex.

3)      I mistakenly used different elements for the Perl code and the XML 
element; here is the corresponding element:
<ml:title source=" " datetime="" xmlns:ml=" 
">\xCF\xEE\xE2\xFB\xF8\xE5\xED\xE8\xE5 
\xFD\xF4\xF4\xE5\xEA\xF2\xE8\xE2\xED\xEE\xF1\xF2\xE8 
\xF3\xEF\xF0\xE0\xE2\xEB\xE5\xED\xE8\xFF \xE2 
\xEA\xEE\xEE\xEF\xE5\xF0\xE0\xF2\xE8\xE2\xED\xEE-\xE8\xED\xF2\xE5\xE3\xF0\xE0\xF6\xE8\xEE\xED\xED\xFB\xF5
 \xF1\xF2\xF0\xF3\xEA\xF2\xF3\xF0\xE0\xF5 \xEF\xF3\xF2\xE5\xEC 
\xE2\xED\xE5\xE4\xF0\xE5\xED\xE8\xFF \xF1\xE8\xF1\xF2\xE5\xEC\xFB 
\xE1\xFE\xE4\xE6\xE5\xF2\xE8\xF0\xEE\xE2\xE0\xED\xE8\xFF</ml:title>

4)      So how do I translate these ASCII literal escape sequences?

Thanks,
Matthew



From: [email protected] [mailto:[email protected]] On Behalf Of David Lee
Sent: Thursday, March 28, 2013 5:18 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Reprocessing non-UTF8 ingested 
records/elements

It would help if you explained the problem a little better.
A few issues that come to mind:

1) xml:lang has nothing to do with encoding, so what is the expectation there?

2) The sample doc is not encoded in anything except ASCII (I assume the xmlns 
is bogus): <ml:title source=" " datetime="" xmlns:ml=" ">\xD2\xE5\xE ...

Those are the literal ASCII characters "\" "x" "D" "2" "\" "x" "E" "5" etc. That 
has nothing at all to do with encoding.

3) Your Perl code is using Perl escape sequences, which have nothing to do with 
the data in your sample XML.

4) Encoding is a property of a file *outside* of the XML data model. If a 
file is imported in the wrong encoding it will be garbage 
and can't be re-encoded ... But if the file is like you say above, it's not that 
it's badly encoded ... it contains literal escape sequences as ASCII, which is 
an entirely different problem.
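
To make points 2) and 3) concrete, here is a small sketch in Python (used only 
for illustration; describe is a made-up helper). It shows that the four 
characters stored in the document and the single byte that a "\xCF" escape 
produces in program source are different things.

```python
def describe(value):
    """Return (length, numeric values) for a str or a bytes value."""
    if isinstance(value, bytes):
        return len(value), list(value)
    return len(value), [ord(ch) for ch in value]


# What the stored title contains: the ASCII characters \ x C F.
STORED = r"\xCF"

# What an escape in program source produces: one byte of value 0xCF.
IN_SOURCE = b"\xCF"

print(describe(STORED))     # four ASCII characters
print(describe(IN_SOURCE))  # one byte
```

describe(STORED) reports four ASCII characters (92, 120, 67, 70), while 
describe(IN_SOURCE) reports a single byte with value 207 (0xCF). Only the 
second is raw encoded data that a decoder can act on directly; the first has to 
be parsed back into bytes before any decoding can happen.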




From: [email protected] [mailto:[email protected]] On Behalf Of Treskon, Matthew
Sent: Thursday, March 28, 2013 5:12 PM
To: [email protected]
Subject: [MarkLogic Dev General] Reprocessing non-UTF8 ingested records/elements

Somewhere "down the line", a few records that have been ingested into an ML 
database have embedded hex code from a different encoding scheme than the 
native UTF8 (such as cp-1251, cp-1256):

<ml:title source=" " datetime="" xmlns:ml=" 
">\xD2\xE5\xED\xE4\xE5\xED\xF6\xE8\xE8 \xE8 
\xE7\xE0\xEA\xEE\xED\xEE\xEC\xE5\xF0\xED\xEE\xF1\xF2\xE8 
\xE2\xE5\xE4\xE5\xED\xE8\xFF \xFD\xEA\xEE\xEB\xEE\xE3\xE8\xF7\xE5\xF1\xEA\xE8 
\xE1\xE5\xE7\xEE\xEF\xE0\xF1\xED\xEE\xE3\xEE 
\xEF\xF0\xEE\xE8\xE7\xE2\xEE\xE4\xF1\xF2\xE2\xE0 
\xF1\xE5\xEB\xFC\xF1\xEA\xEE\xF5\xEE\xE7\xFF\xE9\xF1\xF2\xE2\xE5\xED\xED\xEE\xE9
 \xEF\xF0\xEE\xE4\xF3\xEA\xF6\xE8\xE8</ml:title>

Once these records have been identified and the encoding scheme determined 
(xml:lang is present in sibling elements), how do I reprocess them (i.e. say 
"input" is cp-1251, output utf8)? I can see xdmp:document-load has an encoding 
option, but I'd hope there is a better way to handle this than export then 
reimport.

I'm not sure if this helps clarify. I can do this in Perl:

======
use strict;
use warnings;
require "Encode.pm";

binmode STDOUT, ":encoding(utf-8)";

my $string = "\xCF\xEE\xE2\xFB\xF8\xE5\xED\xE8\xE5 
\xFD\xF4\xF4\xE5\xEA\xF2\xE8\xE2\xED\xEE\xF1\xF2\xE8 
\xF3\xEF\xF0\xE0\xE2\xEB\xE5\xED\xE8\xFF \xE2 
\xEA\xEE\xEE\xEF\xE5\xF0\xE0\xF2\xE8\xE2\xED\xEE-\xE8\xED\xF2\xE5\xE3\xF0\xE0\xF6\xE8\xEE\xED\xED\xFB\xF5
 \xF1\xF2\xF0\xF3\xEA\xF2\xF3\xF0\xE0\xF5 \xEF\xF3\xF2\xE5\xEC 
\xE2\xED\xE5\xE4\xF0\xE5\xED\xE8\xFF \xF1\xE8\xF1\xF2\xE5\xEC\xFB 
\xE1\xFE\xE4\xE6\xE5\xF2\xE8\xF0\xEE\xE2\xE0\xED\xE8\xFF";

print Encode::decode("cp-1251",$string);

--> Повышение эффективности управления в кооперативно-интеграционных
--> структурах путем внедрения системы бюджетирования<--

======


Thank you,
--Matthew Treskon







This electronic message contains information generated by the USDA solely for 
the intended recipients. Any unauthorized interception of this message or the 
use or disclosure of the information it contains may violate the law and 
subject the violator to civil or criminal penalties. If you believe you have 
received this message in error, please notify the sender and delete the email 
immediately.

------------------------------

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general


End of General Digest, Vol 105, Issue 66
****************************************


Nothing in this message is intended to constitute an electronic signature 
unless a specific statement to the contrary is included in this message. 
Confidentiality Note: This message is intended only for the person or entity to 
which it is addressed. It may contain confidential and/or proprietary material. 
Any review, transmission, dissemination or other use, or taking of any action 
in reliance upon this message by persons or entities other than the intended 
recipient is prohibited. If you received this message in error, please contact 
the sender and delete it from your computer.

End of General Digest, Vol 106, Issue 5
***************************************

