Re: [basex-talk] Huge CSV

2018-08-12 Thread Giuseppe Celano
Yes, I build them, but I do not use them explicitly all the time.  

> On Aug 13, 2018, at 12:04 AM, Liam R. E. Quin  wrote:
> 
> On Sun, 2018-08-12 at 23:58 +0200, Giuseppe Celano wrote:
>> more documents accessed sequentially is better than one
>> big file.
> 
> Are you building indexes in the database? Do your queries make use of
> them?
> 
> You may find using the full text extensions useful.
> 
> Liam
> 
> 
> -- 
> Liam Quin, https://www.holoweb.net/liam/cv/
> Web slave for vintage clipart http://www.fromoldbooks.org/
> Available for XML/Document/Information Architecture/
> XSL/XQuery/Web/Text Processing/A11Y work & consulting.
> 



Re: [basex-talk] Huge CSV

2018-08-12 Thread Liam R. E. Quin
On Sun, 2018-08-12 at 23:58 +0200, Giuseppe Celano wrote:
> more documents accessed sequentially is better than one
> big file.

Are you building indexes in the database? Do your queries make use of
them?

You may find using the full text extensions useful.

Liam


-- 
Liam Quin, https://www.holoweb.net/liam/cv/
Web slave for vintage clipart http://www.fromoldbooks.org/
Available for XML/Document/Information Architecture/
XSL/XQuery/Web/Text Processing/A11Y work & consulting.
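
Liam's indexing question can be checked directly from the BaseX command line. A minimal sketch of such a session, assuming the database is called "mydb" (the name is an invented placeholder):

```
# Hypothetical session; "mydb" stands in for the actual database name.
OPEN mydb
INFO INDEX              # report which indexes exist and what they contain
CREATE INDEX text       # (re)build the text index used by string comparisons
CREATE INDEX fulltext   # build the full-text index Liam mentions
```

Whether queries actually use these indexes can then be verified in the query info output, which shows the applied index optimizations.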



Re: [basex-talk] Huge CSV

2018-08-12 Thread Giuseppe Celano
Hi Liam,

Thanks for answering. The problem is not only the XML transformation per se, 
but also the subsequent querying of the documents. I see that if I parcel the big 
CSV into smaller (XML) documents and query them sequentially, I have no
performance problems. This is also the case in the database, as far as I can 
see: more documents accessed sequentially is better than one big file.

Ciao,
Giuseppe 


> On Aug 10, 2018, at 9:09 PM, Liam R. E. Quin  wrote:
> 
> On Fri, 2018-08-10 at 13:43 +0200, Giuseppe Celano wrote:
>> I uploaded the file, as it is, in the database,
> 
> I'd probably look for an XSLT transformation to turn it into XML - or
> there are Python and Perl scripts or other programs that can do it -
> and then load the result into a database.
> 
> It's not all that large a file, so maybe it'd help if you described the
> exact problems you were having -- what did you try, what did you expect
> to happen, what actually happened, what steps did you take to
> investigate...
> 
> Liam
> 
> 
> -- 
> Liam Quin, https://www.holoweb.net/liam/cv/
> Web slave for vintage clipart http://www.fromoldbooks.org/
> Available for XML/Document/Information Architecture/
> XSL/XQuery/Web/Text Processing/A11Y work & consulting.
> 



Re: [basex-talk] Huge CSV

2018-08-10 Thread Liam R. E. Quin
On Fri, 2018-08-10 at 13:43 +0200, Giuseppe Celano wrote:
> I uploaded the file, as it is, in the database,

I'd probably look for an XSLT transformation to turn it into XML - or
there are Python and Perl scripts or other programs that can do it -
and then load the result into a database.

It's not all that large a file, so maybe it'd help if you described the
exact problems you were having -- what did you try, what did you expect
to happen, what actually happened, what steps did you take to
investigate...

Liam


-- 
Liam Quin, https://www.holoweb.net/liam/cv/
Web slave for vintage clipart http://www.fromoldbooks.org/
Available for XML/Document/Information Architecture/
XSL/XQuery/Web/Text Processing/A11Y work & consulting.
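
The transformation Liam suggests can also be done with a short script instead of XSLT. A minimal Python sketch, assuming a header-row CSV whose column names are valid XML element names (the sample data and the element names records/record are invented for illustration):

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text, root_name="records", row_name="record"):
    """Convert header-row CSV text into an XML element tree."""
    root = ET.Element(root_name)
    for row in csv.DictReader(io.StringIO(csv_text)):
        rec = ET.SubElement(root, row_name)
        for field, value in row.items():
            # Assumes each header field is already a valid XML name;
            # a real script should sanitize them first.
            ET.SubElement(rec, field).text = value
    return root

sample = "name,city\nGiuseppe,Leipzig\nLiam,Delta"
xml = ET.tostring(csv_to_xml(sample), encoding="unicode")
```

For a 380 MB file, an incremental writer (or BaseX's own CSV import) would be preferable to building the whole tree in memory as this sketch does.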



Re: [basex-talk] Huge CSV

2018-08-10 Thread Giuseppe Celano
I uploaded it as CSV (it is CSV) via the GUI, and it is then converted into XML 
(this conversion probably makes it too big).


> On Aug 10, 2018, at 1:50 PM, Christian Grün  wrote:
> 
>> I uploaded the file, as it is, in the database
> 
> So you uploaded the file as binary? Did you try to import it as XML,
> too? Does »upload« mean that you used the simple REST API?
> 



Re: [basex-talk] Huge CSV

2018-08-10 Thread Christian Grün
> I uploaded the file, as it is, in the database

So you uploaded the file as binary? Did you try to import it as XML,
too? Does »upload« mean that you used the simple REST API?


Re: [basex-talk] Huge CSV

2018-08-10 Thread Giuseppe Celano
I uploaded the file, as it is, in the database, but this does not help. The 
idea was to preliminarily transform the file into XML and then query it, but this 
cannot be done on the fly. So the only thing I can think of is to parcel the 
original CSV file into multiple CSV files, then transform each of them into 
XML, and then query the latter. Are there alternatives? Thanks.

Giuseppe

Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: cel...@informatik.uni-leipzig.de
E-mail: giuseppegacel...@gmail.com
Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano 
Web site 2: https://sites.google.com/site/giuseppegacelano/

> On Aug 10, 2018, at 1:37 PM, Christian Grün  wrote:
> 
> As there are many different ways to process large CSV data with BaseX…
> What did you try so far?
> 
> 
> On Fri, Aug 10, 2018 at 1:36 PM Giuseppe Celano
>  wrote:
>> 
>> Hi,
>> 
>> I am trying to work with a huge CSV file (about 380 MB), but if I build the 
>> database it seems that even simple operations cannot be evaluated. Is 
>> splitting the CSV file the only option, or am I missing something here? 
>> Thanks.
>> 
>> Giuseppe
>> 
>> 
> 



Re: [basex-talk] Huge CSV

2018-08-10 Thread Christian Grün
As there are many different ways to process large CSV data with BaseX…
What did you try so far?


On Fri, Aug 10, 2018 at 1:36 PM Giuseppe Celano
 wrote:
>
> Hi,
>
> I am trying to work with a huge CSV file (about 380 MB), but if I build the 
> database it seems that even simple operations cannot be evaluated. Is 
> splitting the CSV file the only option, or am I missing something here? Thanks.
>
> Giuseppe
>
>
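
As one of those "different ways": instead of materializing one huge XML document, the CSV can be streamed and filtered row by row, so only matching rows are ever held in memory. A minimal Python sketch of the idea (the column names and sample data are invented for illustration):

```python
import csv
import io

def matching_rows(lines, column, value):
    """Lazily yield rows where row[column] == value, without loading the file."""
    for row in csv.DictReader(lines):
        if row.get(column) == value:
            yield row

# In practice `lines` would be an open file handle over the 380 MB CSV.
data = io.StringIO("lemma,count\namo,3\nlego,7\namo,1\n")
hits = list(matching_rows(data, "lemma", "amo"))
```

BaseX's CSV Module offers comparable control over parsing options on the XQuery side; this sketch just shows the generic streaming idea.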


[basex-talk] Huge CSV

2018-08-10 Thread Giuseppe Celano
Hi,

I am trying to work with a huge CSV file (about 380 MB), but if I build the 
database it seems that even simple operations cannot be evaluated. Is splitting 
the CSV file the only option, or am I missing something here? Thanks.

Giuseppe