Re: [basex-talk] Performance related query.
Hi Ankit,

have you already compared the query info output?

Best,
Christian

On Fri, Mar 27, 2015 at 10:35 AM, ankit kumar anky4b...@gmail.com wrote:
> Hi, I am getting performance issue while using my own xquery library. [...]
Re: [basex-talk] Performance related query.
Hi,

Optimized query without module import:

    count(
      db:attribute("large_products",
        distinct-values(db:open-pre("large_products",0)/products/*[@catid]/@catid)
      )/self::id/parent::p:category[parent::products/parent::document-node()]
    )

Optimized query with module import:

    count(
      let $catRefs_4 := distinct-values(db:open-pre("large_products",0)/products/*[@catid]/@catid)
      return ((db:open-pre("large_products",2), ...))[(@id = $catRefs_4)]
    )

I may be wrong, but as I understand it the performance difference is due to db:open-pre. Both queries are otherwise the same; the only difference is that the module version uses a static variable declared in the module file, while the version without the module uses a local variable.

Thanks,
Ankit

On 27 March 2015 at 15:10, Christian Grün christian.gr...@gmail.com wrote:
> Hi Ankit, have you already compared the query info output?
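
The index access that appears in the first optimized query can also be written out by hand. The following is only a sketch, not something posted in the thread: it assumes the database is named 'large_products' (taken from the query info), that its attribute index exists and is up to date, and that the BaseX version in use still provides db:open (newer releases call it db:get).

```xquery
declare namespace p = "a:b:c";

(: Assumption: database name as printed in the query info output. :)
let $db := 'large_products'

(: Collect the distinct category references of all products. :)
let $catRefs := distinct-values(db:open($db)/products/*[@catid]/@catid)

(: Ask the attribute index for all @id attributes with one of these
   values, then step up to the owning p:category elements that sit
   directly below the root products element. :)
return count(
  db:attribute($db, $catRefs, 'id')
    /parent::p:category[parent::products/parent::document-node()]
)
```

Comparing the query info of this version with that of the module version should show whether the slow run is indeed the one without index access.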
[basex-talk] Performance related query.
Hi,

I am running into a performance issue while using my own XQuery library. I have written an XQuery module containing a single function that returns all the categories belonging to a set of products, as given below. There is one constraint: I cannot pass all the categories to the getCategory() function as an argument; I have to take only products as input. I tried to use /products/p:category directly instead of the $lib:category variable, but it reports that the root is not found, so I had to define it as a global variable.

== XQuery Module == product_library.xq

    module namespace lib = "product_library";
    declare namespace p = "a:b:c";
    declare variable $lib:category := /products/p:category;
    declare function lib:getCategory($products) {
      let $catRefs := distinct-values($products/@catid)
      return $lib:category[@id = $catRefs]
    };

I then import this library in another file, where I invoke the getCategory() function of the module. My code for that is given below.

== Client Code == product_client.xq

    import module 'product_library' at 'file:///C:/Users/ankumar/Desktop/product_library.xq';
    declare namespace lib = "product_library";
    declare namespace p = "a:b:c";
    let $products := /products/*[@catid]
    return count(lib:getCategory($products))

Executing the above code takes too long, so I stopped it and wrote the same logic as the module's getCategory() function directly in the client file, as given below.

== Changed Client Code ==

    import module 'product_library' at 'file:///C:/Users/ankumar/Desktop/product_library.xq';
    declare namespace lib = "product_library";
    declare namespace p = "a:b:c";
    let $products := /products/*[@catid]
    (: return count(lib:getCategory($products)) :)
    let $catRefs := distinct-values($products/@catid)
    return count(/products/p:category[@id = $catRefs])

This executes very fast and gives me the desired result within a second. I don't know why this happens. My whole module is written in the same way. If you have any idea why this happens and how I can make it efficient, please share it with me.

I am attaching my module file, the client file, the XML instance file, and the query info for the runs with and without the module. Does this have anything to do with query optimization?

large.xml https://docs.google.com/file/d/0B_pB7l14skhVMkxzVjZpRWJ3blU/edit?usp=drive_web

Error: Interrupted.

Compiling:
- inlining Q{product_library}getCategory#1
- inlining $products_3
- simplifying flwor expression
- inlining $products_2
- simplifying flwor expression

Query:

    import module 'product_library' at 'file:///C:/Users/ankumar/Desktop/product_library.xq';
    declare namespace lib = "product_library";
    declare namespace p = "a:b:c";
    let $products := /products/*[@catid]
    return count(lib:getCategory($products))

Optimized Query:

    count(
      let $catRefs_4 := distinct-values(db:open-pre("large_products",0)/products/*[@catid]/@catid)
      return ((db:open-pre("large_products",2), ...))[(@id = $catRefs_4)]
    )

Query plan:

    <QueryPlan compiled="true">
      <FnCount name="count(items)">
        <GFLWOR>
          <Let>
            <Var name="$catRefs" id="4"/>
            <FnDistinctValues name="distinct-values(items[,collation])">
              <IterPath>
                <DBNode name="large_products" pre="0"/>
                <IterStep axis="child" test="products"/>
                <IterStep axis="child" test="*">
                  <CachedPath>
                    <IterStep axis="attribute" test="catid"/>
                  </CachedPath>
                </IterStep>
                <IterStep axis="attribute" test="catid"/>
              </IterPath>
            </FnDistinctValues>
          </Let>
          <IterFilter>
            <ItemSeq size="30">
              <DBNode name="large_products" pre="2"/>
              <DBNode name="large_products" pre="13"/>
              <DBNode name="large_products" pre="24"/>
              <DBNode name="large_products" pre="35"/>
              <DBNode name="large_products" pre="46"/>
            </ItemSeq>
            <CmpG op="=">
              <CachedPath>
                <IterStep axis="attribute" test="id"/>
              </CachedPath>
              <VarRef>
                <Var name="$catRefs" id="4"/>
              </VarRef>
            </CmpG>
          </IterFilter>
        </GFLWOR>
      </FnCount>
    </QueryPlan>

Attachments:
- product_client.xq (Description: Binary data)
- product_library.xq (Description: Binary data)
- WithOutModuleQueryInfo.rtf (Description: RTF file)
Re: [basex-talk] Performance related query.
Thanks Christian,

I was able to solve the issue.

Thanks,
Ankit

On 27 March 2015 at 17:10, Christian Grün christian.gr...@gmail.com wrote:
> Hi Ankit, The query info output indicates that only the first query is rewritten for index access (→ db:attribute). [...]
Re: [basex-talk] Performance related query.
Hi Ankit,

The query info output indicates that only the first query is rewritten for index access (→ db:attribute). If you always work with the same database instance, the following version of your query should be evaluated faster:

    declare function lib:getCategory($products) {
      let $catRefs := distinct-values($products/@catid)
      return db:open('db')/products/p:category[@id = $catRefs]
    };

If the database instances are changing, you can pass their names on via an additional argument:

    declare function lib:getCategory($db, $products) {
      let $catRefs := distinct-values($products/@catid)
      return db:open($db)/products/p:category[@id = $catRefs]
    };

Best,
Christian

On Fri, Mar 27, 2015 at 11:03 AM, ankit kumar anky4b...@gmail.com wrote:
> Hi, Optimized query without module import [...]
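
For readers who want to try the two-argument variant end to end, here is one way the module and client could fit together. This is a sketch under assumptions, not the code Ankit ended up with: the database name 'large_products' is taken from the query info, the relative module path and the type annotations are illustrative additions, and db:open is the function name used in this thread (newer BaseX releases call it db:get).

```xquery
(: product_library.xq :)
module namespace lib = "product_library";
declare namespace p = "a:b:c";

(: The database is opened inside the function, so no global variable
   bound to the context item is needed. :)
declare function lib:getCategory(
  $db       as xs:string,
  $products as element()*
) as element(p:category)* {
  let $catRefs := distinct-values($products/@catid)
  return db:open($db)/products/p:category[@id = $catRefs]
};
```

```xquery
(: product_client.xq; the module path is assumed to be relative to this file :)
import module namespace lib = "product_library" at "product_library.xq";

let $db := 'large_products'
let $products := db:open($db)/products/*[@catid]
return count(lib:getCategory($db, $products))
```

Whether the optimizer rewrites this call for index access can be checked in the query info: the optimized query should then contain db:attribute rather than a filter over a sequence of db:open-pre nodes.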