Re: [julia-users] Saving Julia dataframe to read in R using HDF5

2015-01-27 Thread Pavel
RCall.jl is a real breakthrough for combining Julia and R work, thanks for 
the pointer! Here is my example code:
https://gist.github.com/multidis/7ac6f4779e09c986be39

The main advantage is that column types are converted properly (Bool in 
particular), and a native R object is saved in an RData file. I have not 
tested performance with large objects yet, so any advice on improving the 
code is appreciated.
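
For readers who cannot open the gist, here is a minimal sketch of the approach described above. It is not the gist's code. It assumes a recent RCall.jl (the `@rput` macro and the `R"..."` string macro), a working R installation, and made-up column names and a placeholder file name.

```julia
using DataFrames, RCall

# Illustrative data; the column names and values are invented for this sketch.
df = DataFrame(id = 1:3, flag = [true, false, true], score = [0.5, 1.25, 2.0])

# Copy the Julia DataFrame into the embedded R session under the name `df`.
@rput df

# Show the column classes on the R side; the Bool column should appear as "logical".
R"print(sapply(df, class))"

# Save the native R data.frame; "trydf.RData" is a placeholder file name.
R"save(df, file = 'trydf.RData')"
```

In a separate R session, `load("trydf.RData")` should then bring back `df` as an ordinary data.frame.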



Re: [julia-users] Saving Julia dataframe to read in R using HDF5

2015-01-22 Thread Tom Short
I don't know if it can do it yet, but the RCall package might be able to
save data back to an RData file. It's a young package.

Also, you could use CSV files.
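
A minimal sketch of the CSV route, using only base Julia so that it does not depend on any particular DataFrames or CSV package version. The column names, the helper names `rcell` and `write_r_csv`, and the file name are hypothetical. It assumes no cell contains a comma, quote, or newline; a CSV package would handle quoting properly.

```julia
# Column-oriented data standing in for a DataFrame (names are illustrative).
colnames = ["id", "flag", "score"]
cols = Any[[1, 2, 3], [true, false, true], [0.5, 1.25, 2.0]]

# Format one cell: Bool becomes TRUE/FALSE, everything else uses string().
rcell(x::Bool) = x ? "TRUE" : "FALSE"
rcell(x) = string(x)

# Write a header line followed by one comma-separated line per row.
function write_r_csv(path, colnames, cols)
    open(path, "w") do io
        println(io, join(colnames, ","))
        for i in 1:length(cols[1])
            println(io, join([rcell(c[i]) for c in cols], ","))
        end
    end
end

write_r_csv("trydf.csv", colnames, cols)
```

On the R side, `str(read.csv("trydf.csv"))` is a quick way to check that the `flag` column came through as logical.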



Re: [julia-users] Saving Julia dataframe to read in R using HDF5

2015-01-22 Thread Tim Holy
On Thursday, January 22, 2015 04:48:13 PM Pavel wrote:
> Thanks, Tim, for responding. I tried with `JLD.jldopen` instead. Now all the
> columns are saved, including boolean without conversion to integer, as
> expected. However, the R session consistently crashes when trying to even
> look at the file structure with `rhdf5::h5ls("trydf.h5")`. Not sure if this
> is an issue with the rhdf5 R package or not, but something goes wrong when
> JLD annotations are present.

From the Bioconductor website it appears that rhdf5 aims to be a generic HDF5 
interface. So if it's crashing on a *.jld file---which is an HDF5 file---then 
it indicates some limitation of rhdf5.

> On a more conceptual level, are R and Julia DataFrame structures too
> different to manage read/write without reassembling from separate columns?

Can't answer that, because I don't know R. Maybe someone else can. If you try 
running the jld_dataframe.jl test in HDF5.jl and inspect the results with 
h5dump, you'll see that each column is already split out for you, if you know 
where to look. (Start with the "df2" data set and follow the references.)

Best,
--Tim



Re: [julia-users] Saving Julia dataframe to read in R using HDF5

2015-01-22 Thread Pavel
Thanks, Tim, for responding. I tried with `JLD.jldopen` instead. Now all the 
columns are saved, including boolean without conversion to integer, as 
expected. However, the R session consistently crashes when trying to even 
look at the file structure with `rhdf5::h5ls("trydf.h5")`. Not sure if this 
is an issue with the rhdf5 R package or not, but something goes wrong when 
JLD annotations are present.

On a more conceptual level, are R and Julia DataFrame structures too 
different to manage read/write without reassembling from separate columns?



Re: [julia-users] Saving Julia dataframe to read in R using HDF5

2015-01-22 Thread Tim Holy
In your code, could you basically replace `h5open` with `jldopen`? That way 
when you try reading the same file again with julia, you'll have all the type 
information.

JLD is basically "HDF5 with annotations that JLD knows how to interpret." If 
you're reading the file from another language, you don't have to pay attention 
to the annotations (unless you want to).

--Tim
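
A minimal sketch of the suggestion above (writing through `jldopen` rather than `h5open`), assuming the JLD.jl package is installed. The data, the dataset names, and the file name are made up for illustration; this is not the code from the StackOverflow question.

```julia
using JLD

# Illustrative columns; names and values are invented for this sketch.
score = [0.5, 1.25, 2.0]
flag = [true, false, true]

# Write each column as its own named dataset in a JLD (HDF5-based) file.
jldopen("trydf.jld", "w") do file
    write(file, "score", score)
    write(file, "flag", flag)
end

# Read one column back in Julia; the JLD annotations let it return as Bool.
flag_back = jldopen("trydf.jld", "r") do file
    read(file, "flag")
end
println(eltype(flag_back))
```

Whether another HDF5 reader copes with the extra annotations depends on that reader.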




[julia-users] Saving Julia dataframe to read in R using HDF5

2015-01-22 Thread Pavel
While reading R datasets in Julia has already received sufficient attention, 
sometimes the results of computations done in Julia need to be readable in 
R. To accomplish that, I was trying to save a DataFrame.jl object in an HDF5 
file. The code so far is in my StackOverflow question (probably should have 
posted here instead):
http://stackoverflow.com/questions/28084403/saving-julia-dataframe-to-read-in-r-using-hdf5

The dataframe can then be reassembled in R using the rhdf5 package tools. It 
works in principle, but is there a more elegant way to accomplish this? 
Something that does not require splitting the dataframe apart and 
reassembling it in R, losing some column types (e.g. boolean does not work) 
along the way?