Re: [basex-talk] Huge No of XML files.

2019-12-18 Thread Liam R. E. Quin
On Wed, 2019-12-18 at 11:10 +0530, Sreenivasulu Yadavalli wrote:
> > 
> What exactly do you mean by moving collections around?
> 
> A: moving the collections in the same system. 

So, you use the Linux "mv" command to do this? Or what?

What exactly do you mean by collections? I for one would find it easier
if you would stop talking in riddles, as my telepathy skills are weak.

> And every day we have to
> update the existing collection with call data. So finding the
> collection is
> taking more time

How do you look for the collection? Isn't it a separate BaseX database?

> 
> Are you taking a database with 100 million documents and renaming
> 50,000 of them?
> 
> What operations exactly are slow?
> 
> A: finding the existing collection.

find / -name collection.db ?

This is a little frustrating in that you are asking for people's help
but not explaining the problem. Are you saying that fn:collection() is
slow in BaseX? What arguments are you passing it exactly? What is the
size, in gigabytes, of the database, on disk? How many documents are in
it?

Can you give step-by-step EXACT AND PRECISE instructions so someone
else could reproduce the problem you are having? Complete and exact
instructions, with sample files if needed, so they can reproduce the
problem on their own computer?

A database with 80,000 files is easy to "find" here, and opens quickly,
in a small fraction of a second. It doesn't take hours.
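To make such a reproduction concrete, here is a minimal sketch for generating a directory of small sample XML files (the file names and element names are made up for illustration; adjust the count to match your real workload):

```python
# Generate a batch of small sample XML documents so that a
# "many small files" BaseX database can be reproduced locally.
import os
import tempfile

def generate_sample_docs(directory, count=1000):
    """Write `count` small XML files into `directory`; return the count."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        path = os.path.join(directory, f"doc{i:06d}.xml")
        with open(path, "w", encoding="utf-8") as f:
            # Tiny document, roughly in the shape of the reports discussed here.
            f.write(f"<row ser_no='{i}'><Calls>{i % 7}</Calls></row>\n")
    return count

target = os.path.join(tempfile.gettempdir(), "xml-repro")
generate_sample_docs(target, count=1000)
```

A database can then be built from that directory with the BaseX command line, e.g. basex -c "CREATE DB repro /tmp/xml-repro" (adjust the path to wherever the script wrote the files on your system).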

Is something else running on your computer that makes it slow?

Note: please remember to copy the list in your replies, as the BaseX
people are far more knowledgeable about BaseX than i am :) My goal as
an analyst is to get you to explain the problem you are having clearly
enough that you can get an answer :)

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org



Re: [basex-talk] Huge No of XML files.

2019-12-17 Thread Liam R. E. Quin
On Tue, 2019-12-17 at 11:48 +0530, Sreenivasulu Yadavalli wrote:
> 
> Every day we are moving collections of around 55k to 60k XML files for a
> large account.


Here, i just created a BaseX database with 80,000 XML files. It took
under one minute on the Linux desktop system i use.

>  Its taking more than 18 hours.
This makes no sense. How much memory do you have on the computer?

What exactly do you mean by moving collections around?

Are you taking a database with 100 million documents and renaming
50,000 of them?

What operations exactly are slow?

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org



Re: [basex-talk] Huge No of XML files.

2019-12-16 Thread Christian Grün
> A: Every day we are generating collections (Unbilled Mobile Transaction
> data), approx. 50k - 55k files per account.

So you are creating a new collection every day, and it will have
around 60k documents at the end of the day? Will you discard older
collections? And I assume you need to incrementally add the documents,
or do you have all of them at hand before you run the first queries?

In the latter case, you might need to partition your database and work
with multiple instances (see [1,2] for details). In both cases, it
might be advisable to experiment with the UPDINDEX option [3].
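For example, enabling incremental index updates is a one-line option before the database is created; a sketch of the command sequence (database name and path are hypothetical, check the syntax against the current docs):

```
SET UPDINDEX true
CREATE DB unbilled_2019_12
ADD /data/unbilled/2019-12-18/
```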

> 4. How do your queries for reporting data look like?

Feel invited to give us more information on the database that you are
accessing in your query.

[1] http://docs.basex.org/wiki/Databases
[2] http://docs.basex.org/wiki/Statistics
[3] http://docs.basex.org/wiki/Options#UPDINDEX


Re: [basex-talk] Huge No of XML files.

2019-12-16 Thread Sreenivasulu Yadavalli
Hi Christian,

1. What is the average size of each of your documents?
A:  5k to 7k
2. Will this sum up to a yearly amount of appr. 20 million files, or
do you have an upper limit?
A: Every day we are generating collections (Unbilled Mobile Transaction
data), approx. 50k - 55k files per account.
3. Which of our API(s) are you using to add the documents?
A: import org.basex.server.ClientSession;
4. How do your queries for reporting data look like?
A:

Usage Summary Per Line: 10 minutes

declare function local:getLabelName($serNo as xs:string) {
  let $doc := collection('LabelCollection_1573624392934')
  let $label := $doc//row[@ser_no = $serNo]
  let $lbCount := count($doc//row[@ser_no = $serNo])
  return if (data($label) != '' and $lbCount <= 1) then data($label) else ('')
};

declare function local:getRecords() {
  let $Rows := (collection("867509_Voice_OCTOBER-19_Unbilled_1")
    /SUBCUSTBRK[AccNo[@NO = (6945045)]][DOB >= '2019-10-01' and DOB <= '2019-10-31'])
  let $billNo := data($Rows/CONN/@NO)
  let $topRecords := $billNo
  for $details in $Rows[CONN[data(@NO) = $topRecords]]/Country/ROW
  group by $billNo := data($details/../../CONN/@NO)
  return {
    element Acno { distinct-values(data($details/../../AccNo/@NO)) },
    element BNO { $billNo },
    element SLBl { distinct-values(local:getLabelName($billNo)) },
    element Stat {},
    for $serviceDetails in $details
    group by $groupService := data($serviceDetails/@ServType)
    return (element { concat($groupService, 'C') } { xs:decimal(sum(data($serviceDetails/@Calls))) }),
    element TCls { xs:decimal(sum(data($details/@Calls))) }
  }
};

(
  let $records := local:getRecords()[position() le 1]
  let $totalCalls := sum($records/TCls)
  let $pstncls := xs:decimal(sum($records/PSTNC))
  let $vpncls := xs:decimal(sum($records/VPNC))
  let $tnbscls := xs:decimal(sum($records/TNBSC))
  let $fmccls := xs:decimal(sum($records/FMCC))
  return (
    $records,
    if (count($records) > 0) then (
      { { 'Total' }, , , ,
        { xs:decimal($pstncls) }, { xs:decimal($vpncls) },
        { xs:decimal($tnbscls) }, { xs:decimal($fmccls) },
        { xs:decimal($totalCalls) } }
    ) else ()
  )
)


Trend Report by Call type: 4 minutes

declare function local:getDurationPer($ctDur as xs:double, $ctTotalDur as xs:double) {
  if ($ctTotalDur = 0) then () else (
    let $ctPer := concat(xs:decimal(round-half-to-even((($ctDur div $ctTotalDur) * 100), 3)), "%")
    return $ctPer
  )
};

declare function local:getHHMMSSFromNumber($num as xs:double) {
  let $ss := xs:integer($num mod 60)
  let $mm1 := xs:integer($num div 60)
  let $mm := xs:integer($mm1 mod 60)
  let $hh := xs:integer($mm1 div 60)
  let $hhFinal := if (string-length(xs:string($hh)) >= 2) then $hh else (concat('0', $hh))
  let $mmFinal := if (string-length(xs:string($mm)) >= 2) then $mm else (concat('0', $mm))
  let $ssFinal := if (string-length(xs:string($ss)) >= 2) then $ss else (concat('0', $ss))
  return concat($hhFinal, ':', $mmFinal, ':', $ssFinal)
};

declare function local:getMMSSFromNumber($num as xs:double) {
  let $ss := xs:integer($num mod 60)
  let $mm := xs:integer($num div 60)
  let $mmFinal := if (string-length(xs:string($mm)) >= 2) then $mm else (concat('0', $mm))
  let $ssFinal := if (string-length(xs:string($ss)) >= 2) then $ss else (concat('0', $ss))
  return concat($mmFinal, ':', $ssFinal)
};

let $subbrk := (collection("867509_Voice_OCTOBER-19_Unbilled_1")
  /SUBCUSTBRK[AccNo[@NO = (6945045)]][DOB >= '2019-10-01' and DOB <= '2019-10-31'])
let $detailUsageTxn := $subbrk/DETAIL/TRANSACTION[@Usage = 'usage']
let $DistCallTypes := distinct-values($detailUsageTxn/SUB_SECTION/@Type)
let $allmonths := distinct-values($subbrk/Month)
let $rowcnt := count($detailUsageTxn/SUB_SECTION)
let $months := $allmonths[position() le 1]
let $opttype := 'NumVal'
let $durop := 'hh:mm:ss'
return (
  if ($DistCallTypes != '') then (
    for $month in $months
    return { $month } {
      if ($opttype = 'NumVal') then
        for $ct in $DistCallTypes
        let $ctCnt := sum($detailUsageTxn/SUB_SECTION[data(@Type) = data($ct) and data(../../../Month) = $month]/@Calls)
        return element { replace(concat('_', data($ct)), '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\)| |')', '_') } { xs:decimal($ctCnt) }
      else if ($opttype = 'NumPerViewPoint') then
        for $ct in $DistCallTypes
        let $ctCnt := sum($detailUsageTxn/SUB_SECTION[data(@Type) = data($ct) and data(../../../Month) = $month]/@Calls)
        let $ctTotalCnt := sum($detailUsageTxn/SUB_SECTION[data(@Type) = data($ct)]/@Calls)
        let $ctPer := concat(xs:decimal(round-half-to-even((($ctCnt div $ctTotalCnt) * 100), 3)), '%')
        return element { replace(concat('_', data($ct)), '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\)| |')', '_') } { $ctPer }
      else if ($opttype = 'NumPerCallType') then
        for $ct in $Dist
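As an aside, the padded time formatting in local:getHHMMSSFromNumber / local:getMMSSFromNumber above can be expressed more compactly with integer division and remainder; a rough Python equivalent, for illustration only (not a drop-in replacement for the XQuery):

```python
# Split a duration given in seconds into zero-padded "hh:mm:ss",
# mirroring the XQuery local:getHHMMSSFromNumber logic above.
def hhmmss(num: float) -> str:
    mm1, ss = divmod(int(num), 60)   # total minutes, leftover seconds
    hh, mm = divmod(mm1, 60)         # hours, leftover minutes
    return f"{hh:02d}:{mm:02d}:{ss:02d}"

# Mirror of local:getMMSSFromNumber: minutes are not wrapped at 60.
def mmss(num: float) -> str:
    mm, ss = divmod(int(num), 60)
    return f"{mm:02d}:{ss:02d}"

print(hhmmss(3725))  # → 01:02:05
print(mmss(125))     # → 02:05
```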

Re: [basex-talk] Huge No of XML files.

2019-12-16 Thread Christian Grün
> Pls help me.

Trying. Could you try to help us first and answer all of the questions
of my initial reply?


Re: [basex-talk] Huge No of XML files.

2019-12-16 Thread Sreenivasulu Yadavalli
Hi Christian,

Every day we are generating up to 50k XML files per account. At the time of
collection generation I need to generate a report on the selected account,
but then that collection gets terminated and I am not able to generate the
report. Even creating the collection takes more than 15 hours for that
particular account.

Pls help me.

Regards,
YSL

On Tue, Dec 17, 2019 at 12:27 PM Christian Grün 
wrote:

> Hi YSL,
>
> > Every day we are moving collections of around 55k to 60k XML files for a
> large account. It is taking more than 18 hours. At that time we want to
> access the collection to generate a report, but it is in lock mode and
> scrip that collection.
>
> Some questions back:
>
> 1. What is the average size of each of your documents?
> 2. Will this sum up to a yearly amount of appr. 20 million files, or
> do you have an upper limit?
> 3. Which of our API(s) are you using to add the documents?
> 4. How do your queries for reporting data look like?
>
> > Please help and do needful.
>
> https://ell.stackexchange.com/a/17626 ;)
>
> Best,
> Christian
>


Re: [basex-talk] Huge No of XML files.

2019-12-16 Thread Christian Grün
Hi YSL,

> Every day we are moving collections of around 55k to 60k XML files for a
> large account. It is taking more than 18 hours. At that time we want to
> access the collection to generate a report, but it is in lock mode and
> scrip that collection.

Some questions back:

1. What is the average size of each of your documents?
2. Will this sum up to a yearly amount of appr. 20 million files, or
do you have an upper limit?
3. Which of our API(s) are you using to add the documents?
4. How do your queries for reporting data look like?

> Please help and do needful.

https://ell.stackexchange.com/a/17626 ;)

Best,
Christian


[basex-talk] Huge No of XML files.

2019-12-16 Thread Sreenivasulu Yadavalli
Hi Team,

We have a big problem creating the collection because of the large number of
XML files.

Every day we are moving collections of around 55k to 60k XML files for a
large account. It is taking more than 18 hours. At that time we want to
access the collection to generate a report, but it is in lock mode and
scrip that collection.

Please help and do needful.

Regards,

YSL