Re: [basex-talk] Huge CSV
Yes, I build them, but I do not use them explicitly all the time.

> On Aug 13, 2018, at 12:04 AM, Liam R. E. Quin wrote:
>
> On Sun, 2018-08-12 at 23:58 +0200, Giuseppe Celano wrote:
>> more documents accessed sequentially is better than one
>> big file.
>
> Are you building indexes in the database? Do your queries make use of
> them?
>
> You may find using the full text extensions useful.
>
> Liam
Re: [basex-talk] Huge CSV
On Sun, 2018-08-12 at 23:58 +0200, Giuseppe Celano wrote:
> more documents accessed sequentially is better than one
> big file.

Are you building indexes in the database? Do your queries make use of
them?

You may find using the full text extensions useful.

Liam

--
Liam Quin, https://www.holoweb.net/liam/cv/
Web slave for vintage clipart http://www.fromoldbooks.org/
Available for XML/Document/Information Architecture/
XSL/XQuery/Web/Text Processing/A11Y work & consulting.
Re: [basex-talk] Huge CSV
Hi Liam,

Thanks for answering. The problem is not only the XML transformation per se, but also the subsequent querying of the documents. I see that if I parcel the big CSV into smaller (XML) documents and query them sequentially, I have no performance problems. This also seems to be the case in the database, as far as I can see: more documents accessed sequentially is better than one big file.

Ciao,
Giuseppe

> On Aug 10, 2018, at 9:09 PM, Liam R. E. Quin wrote:
>
> On Fri, 2018-08-10 at 13:43 +0200, Giuseppe Celano wrote:
>> I uploaded the file, as it is, in the database,
>
> I'd probably look for an XSLT transformation to turn it into XML - or
> there are Python and Perl scripts or other programs that can do it -
> and then load the result into a database.
>
> It's not all that large a file, so maybe it'd help if you described the
> exact problems you were having -- what did you try, what did you expect
> to happen, what actually happened, what steps did you take to
> investigate...
>
> Liam
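The "many small documents, queried sequentially" pattern described above can be sketched outside BaseX as well. A minimal Python illustration (the glob pattern, element names, and the `count_matches` helper are hypothetical, chosen only for the example):

```python
import glob
import xml.etree.ElementTree as ET

def count_matches(pattern, tag, value):
    """Stream over many small XML files, counting elements whose text equals value."""
    total = 0
    for path in sorted(glob.glob(pattern)):
        # Each file is parsed and discarded in turn, so peak memory is
        # bounded by the largest single document, not the whole collection.
        root = ET.parse(path).getroot()
        total += sum(1 for el in root.iter(tag) if el.text == value)
    return total
```

The same idea applies inside a BaseX database: a collection of smaller documents lets each query touch one document at a time instead of materializing one very large tree.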
Re: [basex-talk] Huge CSV
On Fri, 2018-08-10 at 13:43 +0200, Giuseppe Celano wrote:
> I uploaded the file, as it is, in the database,

I'd probably look for an XSLT transformation to turn it into XML - or
there are Python and Perl scripts or other programs that can do it -
and then load the result into a database.

It's not all that large a file, so maybe it'd help if you described the
exact problems you were having -- what did you try, what did you expect
to happen, what actually happened, what steps did you take to
investigate...

Liam

--
Liam Quin, https://www.holoweb.net/liam/cv/
Web slave for vintage clipart http://www.fromoldbooks.org/
Available for XML/Document/Information Architecture/
XSL/XQuery/Web/Text Processing/A11Y work & consulting.
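The script-based conversion Liam suggests can be done in a few lines of Python. A minimal sketch, assuming the CSV has a header row whose column names happen to be valid XML element names; the `records`/`record` element names and the `csv_to_xml` function are my own choices for illustration, not anything prescribed by BaseX:

```python
import csv
import xml.etree.ElementTree as ET

def csv_to_xml(csv_path, xml_path):
    """Convert a CSV file with a header row into a flat XML document."""
    root = ET.Element("records")
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            rec = ET.SubElement(root, "record")
            for name, value in row.items():
                # One child element per CSV column; assumes the column
                # header is usable as an XML element name.
                ET.SubElement(rec, name).text = value
    ET.ElementTree(root).write(xml_path, encoding="utf-8", xml_declaration=True)
```

The resulting XML file can then be added to a database in the usual way. Note that for a 380 MB CSV the XML output will be noticeably larger, since every field value gains opening and closing tags.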
Re: [basex-talk] Huge CSV
I uploaded it as CSV (it is CSV) via the GUI, and it is then converted into XML (this conversion probably makes it too big).

> On Aug 10, 2018, at 1:50 PM, Christian Grün wrote:
>
>> I uploaded the file, as it is, in the database
>
> So you uploaded the file as binary? Did you try to import it as XML,
> too? Does »upload« mean that you used the simple REST API?
Re: [basex-talk] Huge CSV
> I uploaded the file, as it is, in the database

So you uploaded the file as binary? Did you try to import it as XML,
too? Does »upload« mean that you used the simple REST API?
Re: [basex-talk] Huge CSV
I uploaded the file, as it is, in the database, but this does not help. The idea was to preliminarily transform the file into XML and then query it, but this cannot be done on the fly. So the only thing I can think of is to parcel the original CSV file into multiple CSV files, transform each of them into XML, and then query the latter. Are there alternatives? Thanks.

Giuseppe

Universität Leipzig
Institute of Computer Science, NLP
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: cel...@informatik.uni-leipzig.de
E-mail: giuseppegacel...@gmail.com
Web site 1: http://asv.informatik.uni-leipzig.de/en/staff/Giuseppe_Celano
Web site 2: https://sites.google.com/site/giuseppegacelano/

> On Aug 10, 2018, at 1:37 PM, Christian Grün wrote:
>
> As there are many different ways to process large CSV data with BaseX…
> What did you try so far?
>
> On Fri, Aug 10, 2018 at 1:36 PM Giuseppe Celano wrote:
>>
>> Hi,
>>
>> I am trying to work with a huge CSV file (about 380 MB), but if I build the
>> database it seems that even simple operations cannot be evaluated. Is
>> splitting the CSV file the only option or am I missing something here?
>> Thanks.
>>
>> Giuseppe
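The "parcel the original CSV file into multiple CSV files" step can be scripted. A rough Python sketch, where the chunk size, the `.partN.csv` naming scheme, and the `split_csv` helper are arbitrary assumptions for illustration:

```python
import csv

def split_csv(src_path, rows_per_chunk=100_000):
    """Split a large CSV into numbered chunk files, repeating the header in each."""
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        paths, out, writer = [], None, None
        for i, row in enumerate(reader):
            if i % rows_per_chunk == 0:
                # Start a new chunk file and repeat the header row,
                # so each chunk is a self-contained CSV document.
                if out:
                    out.close()
                path = f"{src_path}.part{len(paths)}.csv"
                paths.append(path)
                out = open(path, "w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
            writer.writerow(row)
        if out:
            out.close()
    return paths
```

Each chunk can then be converted to XML and added to the database as its own document, which is exactly the "more documents accessed sequentially" layout discussed later in the thread.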
Re: [basex-talk] Huge CSV
As there are many different ways to process large CSV data with BaseX… What did you try so far?

On Fri, Aug 10, 2018 at 1:36 PM Giuseppe Celano wrote:
>
> Hi,
>
> I am trying to work with a huge CSV file (about 380 MB), but if I build the
> database it seems that even simple operations cannot be evaluated. Is
> splitting the CSV file the only option or am I missing something here?
> Thanks.
>
> Giuseppe
[basex-talk] Huge CSV
Hi,

I am trying to work with a huge CSV file (about 380 MB), but if I build the database it seems that even simple operations cannot be evaluated. Is splitting the CSV file the only option or am I missing something here? Thanks.

Giuseppe