Re: [MarkLogic Dev General] XHTML Transformation in Marklogic

Erik Hennum Tue, 30 Jul 2013 09:08:33 -0700

Hi, Sriram:

XSLT uses template rules that match input against patterns (a restricted subset 
of XPath), which is great for processing documents with large, complex XML 
vocabularies (regardless of whether the documents themselves are large).  Some 
developers find XSLT's XML syntax verbose.


XQuery can branch based on element or type using typeswitch expressions.
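
As a rough illustration of the typeswitch approach, here is a minimal sketch 
of a recursive transform. The input element names (doc, title, para) and the 
local: functions are made up for the example, and the output elements are left 
without the XHTML namespace for brevity:

```xquery
xquery version "1.0";

(: Hypothetical dispatcher: chooses an output shape by node kind and name. :)
declare function local:transform($node as node()) as item()*
{
  typeswitch ($node)
    case element(title) return <h1>{ local:children($node) }</h1>
    case element(para)  return <p>{ local:children($node) }</p>
    (: any other element: drop the wrapper, keep processing its content :)
    case element()      return local:children($node)
    case text()         return $node
    default             return ()
};

(: Hypothetical helper: applies the dispatcher to each child node in order. :)
declare function local:children($node as node()) as item()*
{
  for $child in $node/node()
  return local:transform($child)
};

local:transform(<doc><title>Hello</title><para>World</para></doc>)
```

For this input the sketch should yield <h1>Hello</h1><p>World</p>. The order 
of the case clauses matters: the specific element tests have to come before 
the generic element() fallback.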

IMHO, both are great, but that's really for development teams to decide.
 

Erik Hennum

________________________________________
From: [email protected] 
[[email protected]] on behalf of Gampa, Sriram 
[[email protected]]
Sent: Tuesday, July 30, 2013 7:32 AM
To: [email protected]
Subject: Re: [MarkLogic Dev General] XHTML Transformation in Marklogic

Hi Erik,

Thanks for your response. What is the difference between transformation using 
XQuery and transformation using XSLT? Which one is better?

Thanks,

Sriram Gampa
Off: 407-345-2386 |Cell: 612-867-3232|email: [email protected]

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of 
[email protected]
Sent: Tuesday, July 30, 2013 11:09 AM
To: [email protected]
Subject: General Digest, Vol 109, Issue 50

Send General mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://developer.marklogic.com/mailman/listinfo/general
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific than "Re: 
Contents of General digest..."


Today's Topics:

   1. Re: Registered Query Best Practices (Ron Hitchens)
   2. xdmp:http-put to REST Client API document management endpoint
      (Will Thompson)
   3. Re: xdmp:http-put to REST Client API document management
      endpoint (Erik Hennum)
   4. XHTML Transformation in Marklogic (Gampa, Sriram)
   5. Re: XHTML Transformation in Marklogic (Erik Hennum)
   6. Re: Versioning of data (Singh, Gurbeer)


----------------------------------------------------------------------

Message: 1
Date: Tue, 30 Jul 2013 01:28:36 +0100
From: Ron Hitchens <[email protected]>
Subject: Re: [MarkLogic Dev General] Registered Query Best Practices
To: MarkLogic Developer Discussion <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii


Hi Geert,

   I've done something before where we stored reg ids in a map for easy re-use. 
 In that case, there was a 1:1 correspondence between the reg id and a 
meaningful business domain number.  On this project that's not the case.

   Also, there is not a finite set of queries that need to be registered so 
it's not feasible to pre-register everything once.  New ones can be created 
dynamically.  And the complicated queries are persisted in another database and 
can be referenced later.  This means the queries which should be registered 
will persist across server restarts.  Which means there must be a way to 
register the queries on first use, then make use of those registered queries on 
subsequent requests.

   The re-register-before-each-use pattern solves that nicely, but not if the 
query construction cost must be re-paid each time.  It looks like the robust 
solution is going to have to be catching exceptions for unregistered queries 
and reconstructing the registrations.  It's a shame because that is going to 
add unnecessary complexity to the code.
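
The catch-and-reconstruct approach described above might look roughly like 
the following. This is only a sketch under assumptions: local:build-query and 
local:search-registered are hypothetical names, the query contents are 
placeholders, XDMP-UNREGISTERED is assumed to be the error code raised for an 
unregistered ID, and xdmp:eager is used on the assumption that the search must 
be forced to evaluate inside the try block for the error to be caught there.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: rebuilds the expensive query (placeholder terms). :)
declare function local:build-query() as cts:query
{
  cts:and-query((
    cts:word-query("example"),
    cts:collection-query("policies")
  ))
};

(: Hypothetical helper: search with a saved registration id and pay the
   construction cost again only if the id is no longer registered. :)
declare function local:search-registered($saved-id as xs:unsignedLong) as node()*
{
  try {
    (: force evaluation here so a failure surfaces inside the try :)
    xdmp:eager(
      cts:search(fn:doc(), cts:registered-query($saved-id, "unfiltered"))
    )
  } catch ($e) {
    if ($e/error:code eq "XDMP-UNREGISTERED") then
      (: assumed error code; re-register and retry once :)
      let $new-id := cts:register(local:build-query())
      return cts:search(fn:doc(), cts:registered-query($new-id, "unfiltered"))
    else
      xdmp:rethrow()
  }
};
```

A real version would also need to store the new ID wherever the saved IDs 
are kept, which is where the extra context-passing comes in.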

---
Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
     +44 7879 358 212 (voice)          http://www.ronsoft.com
     +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
"No amount of belief establishes any fact." -Unknown


On Jul 29, 2013, at 8:15 PM, Geert Josten <[email protected]> wrote:

> Hi Ron,
>
> I recently saw a strategy where they deliberately took a different
> approach. In their case the calculation of the queries was not
> straight-forward and could run into 30k search terms. Additionally,
> registering the query, and warming up cache by doing one initial
> search after registering each query took most time. They were
> searching roughly 40 million docs. The searches themselves were sub-second.
>
> Their approach was to store all registered query id's somewhere, and
> have them readily available at actual search time. They also used a
> try catch to catch unregistered queries, though in their case they
> shouldn't actually occur, and these dramatically pulled down the
> average on performance tests.
>
> How much chance is there that a query is unregistered, if you would
> prepare all queries beforehand?
>
> Cheers,
> Geert
>
>> -----Original Message-----
>> From: [email protected] [mailto:general-[email protected]]
>> On Behalf Of Michael Blakeley
>> Sent: Monday, 29 July 2013 21:08
>> To: MarkLogic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] Registered Query Best Practices
>>
>> I think you're using registered query as intended. That behavior sounds
>> odd to me. I would expect (2) to be cheap, just a hash operation on the
>> query terms, and I would expect (3) to be the expensive step.
>>
>> So I would contact support and see what they think.
>>
>> -- Mike
>>
>> On 29 Jul 2013, at 11:03 , Ron Hitchens <[email protected]> wrote:
>>
>>>
>>>  What is the best practice these days for using registered queries?
>>> I was under the impression that the pattern should be:
>>>
>>> 1) Create your query:
>>>   $query := cts:and-query ((blah blah blah))
>>> 2) Register it and make a registered query from it in one step:
>>>   $reg-query := cts:registered-query (cts:register ($query), "unfiltered")
>>> 3) Use it in a search:
>>>   cts:search (fn:doc(), $reg-query)
>>>
>>>  The theory being that if the cts:query described by $query is
>>> already registered, then the registration is essentially a no-op and
>>> you'll get back the same ID.  And doing this every time ensures that
>>> if the registered query has been evicted for some reason then it's
>>> re-registered and all is well.
>>>
>>>  It's a nice theory but seems to be based on the assumption that
>>> creating a cts:query object is very cheap.  Unfortunately, I'm
>>> finding that this is often not the case, especially when there are
>>> lots of documents in the database.  I have a test case where
>>> performing Step 2 above on a moderately complicated query takes
>>> roughly 200ms every time.
>>> Others take even longer and all seem to be proportional to database
>>> size.
>>> fast (~0ms).  Re-creating the query for re-registering every time is
>>> destroying the benefit of using a registered query.
>>>
>>>  I can obviously save the registration ID obtained from calling
>>> cts:register and then make a cts:registered-query each time, but
>>> then I'm not protected from the query becoming unregistered.  And
>>> there is no lightweight way to test if an ID is still registered.
>>> The only way I know to make this robust is to put a loop and
>>> try/catch around the code that does the search.  But that requires
>>> passing along enough context to re-construct and re-register the
>>> queries (there can be dozens of them in this case).  This is
>>> obviously a lot harder than building the complex query in one module
>>> and then passing it along to the search code somewhere else.
>>>
>>>  What's the generally accepted best usage pattern for registered
>>> queries?  And is it my imagination or has the cost of running
>>> queries been moving from query evaluation into query construction?
>>>
>>>  Thanks.
>>>
>>> ---
>>> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
>>>    +44 7879 358 212 (voice)          http://www.ronsoft.com
>>>    +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
>>> "No amount of belief establishes any fact." -Unknown
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> General mailing list
>>> [email protected]
>>> http://developer.marklogic.com/mailman/listinfo/general
>>>
>>



------------------------------

Message: 2
Date: Tue, 30 Jul 2013 00:29:37 +0000
From: Will Thompson <[email protected]>
Subject: [MarkLogic Dev General] xdmp:http-put to REST Client API
        document management endpoint
To: MarkLogic Discussion <[email protected]>
Message-ID: <ce1c56ff.10999%[email protected]>
Content-Type: text/plain; charset="us-ascii"

Is there another way to do this, or do I have to use xdmp:quote?

xdmp:http-put(
  'https://localhost:8012/v1/documents?uri=/mydoc.xml',
  <options xmlns="xdmp:http">
    <authentication method="digest">
      <username>user</username>
      <password>pass</password>
    </authentication>
    <data>{ xdmp:quote(<doc>My document</doc>) }</data>
  </options>
)


-Will



------------------------------

Message: 3
Date: Tue, 30 Jul 2013 03:56:37 +0000
From: Erik Hennum <[email protected]>
Subject: Re: [MarkLogic Dev General] xdmp:http-put to REST Client API
        document management endpoint
To: MarkLogic Developer Discussion <[email protected]>
Message-ID:
        <dfdf2fd50bf5aa42adaf93ff2e3ca1850ac...@exchg10-be01.marklogic.com>
Content-Type: text/plain; charset="us-ascii"

Hi, Will:

You're right -- at present, the caller to xdmp:http-post() and xdmp:http-put() 
has to serialize the XML.

Have you run into issues using xdmp:quote() to serialize?


Erik Hennum
________________________________________
From: [email protected] 
[[email protected]] on behalf of Will Thompson 
[[email protected]]
Sent: Monday, July 29, 2013 5:29 PM
To: MarkLogic Discussion
Subject: [MarkLogic Dev General] xdmp:http-put to REST Client API document 
management endpoint

Is there another way to do this, or do I have to use xdmp:quote?

xdmp:http-put(
  'https://localhost:8012/v1/documents?uri=/mydoc.xml',
  <options xmlns="xdmp:http">
    <authentication method="digest">
      <username>user</username>
      <password>pass</password>
    </authentication>
    <data>{ xdmp:quote(<doc>My document</doc>) }</data>
  </options>
)


-Will



------------------------------

Message: 4
Date: Tue, 30 Jul 2013 13:43:11 +0000
From: "Gampa, Sriram" <[email protected]>
Subject: [MarkLogic Dev General] XHTML Transformation in Marklogic
To: "[email protected]"
        <[email protected]>
Message-ID:
        <b4dcb4350e007a489f0fc00ed55af9f78eff9...@hmhandmbx03.ex.pubedu.hegn.us>

Content-Type: text/plain; charset="windows-1252"

Hi All,

I am trying to implement an XHTML transformation in my code. I just want to 
know what the different ways of implementing XHTML transformations in MarkLogic 
are. Can I use an object (either a map of key/value pairs or an XML object) 
for an XHTML transformation?

Thanks,

Sriram Gampa
Off: 407-345-2386 |Cell: 612-867-3232|email: 
[email protected]<mailto:[email protected]>

------------------------------

Message: 5
Date: Tue, 30 Jul 2013 13:51:45 +0000
From: Erik Hennum <[email protected]>
Subject: Re: [MarkLogic Dev General] XHTML Transformation in Marklogic
To: MarkLogic Developer Discussion <[email protected]>
Message-ID:
        <dfdf2fd50bf5aa42adaf93ff2e3ca1850ac...@exchg10-be01.marklogic.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi, Sriram:

The input can be any data structure in the XQuery Data Model that your 
transform expects.

Typically, a transform takes an XML structure as input, but there's no reason a 
transform couldn't iterate over the keys of a map or the items of a sequence or 
array.
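
For example, a transform over a map could look something like this. It is 
only a sketch: the keys and values are invented sample data, and the choice 
of a definition list as output is arbitrary.

```xquery
xquery version "1.0-ml";

(: Build a small map of sample metadata (invented values). :)
let $m := map:map()
let $_ := (
  map:put($m, "title", "Policy A"),
  map:put($m, "division", "Legal")
)
return
  (: Emit one dt/dd pair per key, in key order, as an XHTML definition list. :)
  <dl xmlns="http://www.w3.org/1999/xhtml">{
    for $key in map:keys($m)
    order by $key
    return (
      <dt>{ $key }</dt>,
      <dd>{ map:get($m, $key) }</dd>
    )
  }</dl>
```

The explicit order by is there because the order in which map:keys returns 
keys should not be relied on.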

Transforms can be written in XQuery or XSLT.


Hoping that helps,


Erik Hennum

________________________________
From: [email protected] 
[[email protected]] on behalf of Gampa, Sriram 
[[email protected]]
Sent: Tuesday, July 30, 2013 6:43 AM
To: [email protected]
Subject: [MarkLogic Dev General] XHTML Transformation in Marklogic

Hi All,

I am trying to implement an XHTML transformation in my code. I just want to 
know what the different ways of implementing XHTML transformations in MarkLogic 
are. Can I use an object (either a map of key/value pairs or an XML object) 
for an XHTML transformation?

Thanks,

Sriram Gampa
Off: 407-345-2386 |Cell: 612-867-3232|email: 
[email protected]<mailto:[email protected]>


------------------------------

Message: 6
Date: Tue, 30 Jul 2013 14:06:31 +0000
From: "Singh, Gurbeer" <[email protected]>
Subject: Re: [MarkLogic Dev General] Versioning of data
To: MarkLogic Developer Discussion <[email protected]>
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset="us-ascii"

Thanks for the link. It looks like if we use Library Services, we can't use 
CPF. Let me explain my requirement.
We are managing firm policies, which are basically documents [pdf, docx]. Each 
document has some metadata [title, location, division, ...], so we have a 
form where the user uploads a document and adds its metadata.
We call the ML REST API and pass it [metadata + document]. An XQY module 
inserts this request into a folder. An ML pipeline is attached to this 
folder; it converts the document into XHTML and appends the metadata.
The problem is that if the policy already exists, the document is overwritten 
every time. We mainly need to maintain versions of the metadata for audit 
purposes. Could you suggest something?
One way I am planning to implement this: we will create two folders, 
[Version] and [Policy]. [Version] will contain metadata, and [Policy] will 
contain metadata with the document XML.
We will have one XQY module that gets the request from the app, creates an 
XML file in the [Version] folder, and appends metadata to this XML each time. 
The next step inserts the latest metadata into the [Policy] folder. An ML 
pipeline attached to the [Policy] folder will create the XHTML.


Sample XML we will save in the [Version] folder:

<root SlnID="123">
    <v_4>
        <title></title>
        ....
        ...
        ...
        <filepath>//msad/NA/UID/filename</filepath>
        <updatedby></updatedby>
    </v_4>
    ..
    ..
    <v_1>
        <title></title>
        ....
        ...
        ...
        <filepath>//msad/NA/UID/filename</filepath>
        <updatedby></updatedby>
    </v_1>
    <v_0>
        <title></title>
        ....
        ...
        ...
        <filepath>//msad/NA/UID/filename</filepath>
        <updatedby></updatedby>
    </v_0>
</root>

From: [email protected] 
[mailto:[email protected]] On Behalf Of Danny Sokolsky
Sent: Wednesday, July 24, 2013 5:16 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Versioning of data

Hi Gurbeer,

You can look at the Document Library Services (DLS) API:

http://docs.marklogic.com/guide/app-dev/dls
http://docs.marklogic.com/dls

DLS allows you to build check-in, check-out kinds of applications.

-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Singh, Gurbeer
Sent: Wednesday, July 24, 2013 2:00 PM
To: [email protected]
Subject: [MarkLogic Dev General] Versioning of data


I want to explore ML for versioning of data. When we submit a document, it gets 
saved as XML; if we make any edit or update, we overwrite the existing XML. We 
want to stop this and enable versioning so that we can track the history of 
changes. Is there a way to enable this functionality?
Please also help me understand how we can retrieve the latest XML and the 
history of changes.

~Gurbeer



________________________________

NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or 
views contained herein are not intended to be, and do not constitute, advice 
within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and 
Consumer Protection Act. If you have received this communication in error, 
please destroy all electronic and paper copies and notify the sender 
immediately. Mistransmission is not intended to waive confidentiality or 
privilege. Morgan Stanley reserves the right, to the extent permitted under 
applicable law, to monitor electronic communications. This message is subject 
to terms available at the following link: 
http://www.morganstanley.com/disclaimers If you cannot access these links, 
please notify us by reply message and we will send the contents to you. By 
messaging with Morgan Stanley you consent to the foregoing.



------------------------------

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general


End of General Digest, Vol 109, Issue 50
****************************************

