Re: Loading tab spaced data

2017-08-31 Thread Matteo Cossu
tion >> 1911 N. Fort Myer Drive, Suite 800 ♦ Arlington, VA 22209 >> Office: (703)797-3066 >> caleb.me...@parsons.com ♦ www.parsons.com >> >> -Original Message- >> From: Matteo Cossu [mailto:elco...@gmail.com] >> Sent: Tuesday, August 29, 2017 10:36 AM

Re: Loading tab spaced data

2017-08-29 Thread Matteo Cossu
-Original Message- > From: Matteo Cossu [mailto:elco...@gmail.com] > Sent: Tuesday, August 29, 2017 10:36 AM > To: dev@rya.incubator.apache.org > Subject: Re: Loading tab spaced data > > Hello Caleb, > I was trying to load a 53GB file (in parquet format) with 10 containers

RE: Loading tab spaced data

2017-08-29 Thread Meier, Caleb
: Loading tab spaced data Hello Caleb, I was trying to load a 53GB file (in parquet format) with 10 containers, each assigned 15GB of memory. Does someone have some reference numbers, like how big a dataset can be with these resources? This could help me to know when the problem is entirely

Re: Loading tab spaced data

2017-08-29 Thread Matteo Cossu
> 1911 N. Fort Myer Drive, Suite 800 ♦ Arlington, VA 22209 > Office: (703)797-3066 > caleb.me...@parsons.com ♦ www.parsons.com > > -Original Message- > From: Matteo Cossu [mailto:elco...@gmail.com] > Sent: Monday, August 28, 2017 8:04 PM > To: dev@rya.incubator.apache.org >

RE: Loading tab spaced data

2017-08-29 Thread Meier, Caleb
@rya.incubator.apache.org Subject: Re: Loading tab spaced data I would like to help, but I still can't even test Rya properly. For my research I'm developing a similar system (using Spark SQL), and I wanted to compare my software's performance with Rya on the University Cluster. When I try to use these Rya tools

Re: Loading tab spaced data

2017-08-29 Thread Puja Valiyil
Hi Matteo, Rya delegates parsing of input RDF files to the RDF parsers provided by sesame/openrdf. So the issue is due to a bug in the openrdf/sesame parser; it looks like the parser doesn't like tabs. Upgrading Rya to the latest release of openrdf might solve the issue. Aaron has

Re: Loading tab spaced data

2017-08-28 Thread Matteo Cossu
I would like to help, but I still can't even test Rya properly. For my research I'm developing a similar system (using Spark SQL), and I wanted to compare my software's performance with Rya on the University Cluster. When I try to use these Rya tools for loading the data with the big datasets, it

Re: Loading tab spaced data

2017-08-28 Thread Josh Elser
Hi Matteo, Thanks for the bug report. Do you have an interest in making the change to Rya to address this issue? :) In open source projects, we like to encourage users to make changes to "scratch their own itch". Please let us know how we can help enable you to make this change. On

Loading tab spaced data

2017-08-25 Thread Matteo Cossu
Hello, I am having problems loading data with the provided MapReduce code. I am using this class: *org.apache.rya.accumulo.mr.tools.RdfFileInputTool*. When my input data is in N-Triples format and the triples are separated by tabs instead of spaces, I get this error:
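
A possible workaround for the tab problem reported in this thread, until the parser itself accepts tabs, is to rewrite the input so that terms are separated by spaces before loading. The sketch below is only an illustration and is not part of Rya: the function names `tabs_to_spaces` and `convert` are hypothetical, and it assumes the separator is the only thing the parser objects to. Tabs inside quoted literals or inside `<...>` IRIs are left untouched so that the data itself is not changed.

```python
def tabs_to_spaces(line):
    """Replace tabs between N-Triples terms with single spaces.

    Hypothetical helper: tabs inside "..." literals or <...> IRIs are kept.
    """
    out = []
    in_literal = False  # currently inside a "..." literal
    in_iri = False      # currently inside a <...> IRI
    escaped = False     # previous character was a backslash inside a literal
    for ch in line:
        if in_literal:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_literal = False
        elif in_iri:
            out.append(ch)
            if ch == ">":
                in_iri = False
        elif ch == "\t":
            # A tab outside any term is a separator: normalize it.
            out.append(" ")
        else:
            out.append(ch)
            if ch == '"':
                in_literal = True
            elif ch == "<":
                in_iri = True
    return "".join(out)


def convert(src, dst):
    """Copy lines from src to dst, normalizing separators line by line."""
    for line in src:
        dst.write(tabs_to_spaces(line))
```

For a file on disk, `convert` can be called with two open file objects (one opened for reading, one for writing); because it streams line by line, it does not need to hold a large dataset in memory.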