Hi Adrian,

I have been looking into this for the past two days, and here is what I
have found so far.

In `python/test/DataModel.xml`, the paths to the two CSV files
(`Housing-prices-Boston.csv` and `table.csv`) are hard-coded as
`/home/aroles/Documents/proj_persalys/studies/Housing-prices-Boston.csv` and
`/home/aroles/Documents/proj_persalys/studies/table.csv`. This is where the
issue starts. Because these are hard-coded absolute paths, the program cannot
find the files. As a result, the call
`ImportedDataset(fileName, inputColumns, outputColumns)` at line 493 of
`lib/src/base/DataModel.cxx` throws an exception, control goes to the catch
block, and the code falls back to the `.h5` file. The `.h5` file stores its
entries in little-endian format, so the data is read correctly on x86
machines but not on s390x. However, HDF5 files are supposed to be portable
across architectures regardless of endianness. I can see that the call goes
into `libOT.so`, so I will have to dig deeper there to find out exactly why
this is happening.
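
To illustrate the kind of misread I suspect, here is a minimal sketch using
only Python's `struct` module. It does not touch HDF5 or `libOT.so`; the
value `24.0` is an arbitrary example, and the assumption is that a reader on
s390x ends up interpreting little-endian bytes in big-endian order.

```python
import struct

# Encode a double as little-endian bytes, the layout an x86 writer produces.
value = 24.0  # arbitrary example value, not taken from the test data
raw = struct.pack("<d", value)

# Decoding with the matching byte order recovers the original value.
as_little = struct.unpack("<d", raw)[0]

# Decoding the same bytes as big-endian gives a different number, which is
# what a reader that wrongly assumes native order on s390x would see.
as_big = struct.unpack(">d", raw)[0]

print(as_little, as_big)
```

If the values seen on s390x look like the second number (tiny or huge
nonsense values), that would point to a missing byte-order conversion
somewhere in the read path.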

Meanwhile, I tried correcting the paths by changing them to relative paths.
With that change, the first assertion (line 14 of `t_DataModel_load.py`)
passes, because the `.csv` file is picked up instead of the `.h5` file. The
second assertion still fails, though. That is when I noticed that `table.csv`
does not exist in the repo. I am not sure whether it is supposed to be
produced by another test case or was supposed to be uploaded. Since the file
is missing, control falls back to the `.h5` file and the same failure
occurs.
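
A quick way to spot such missing files is a small script like the one below.
It is only a sketch: `missing_csv_paths` is a hypothetical helper of mine, and
it simply scans the XML text for anything that looks like a `.csv` path
rather than parsing the actual study format.

```python
import os
import re

def missing_csv_paths(xml_text, base_dir="."):
    """Return the .csv paths mentioned in xml_text that do not exist.

    Hypothetical helper: relative paths are resolved against base_dir.
    """
    # Match runs of path-like characters ending in ".csv".
    paths = re.findall(r"[\w./-]+\.csv", xml_text)
    missing = []
    for p in paths:
        full = p if os.path.isabs(p) else os.path.join(base_dir, p)
        if not os.path.exists(full):
            missing.append(p)
    return missing
```

It could be run on the contents of `python/test/DataModel.xml`, with
`base_dir` set to the directory the test runs from.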

If `table.csv` were available, changing the paths and uploading the file
should fix the issue. Another workaround would be to add the necessary byte
swaps in `t_DataModel_load.py`.
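
For the byte-swap workaround, something along these lines might do. This is
a rough sketch, assuming the values arrive as 8-byte doubles read in the
wrong byte order; `swap_double` and `fix_if_big_endian` are hypothetical
names, not existing functions in the test or in the library.

```python
import struct
import sys

def swap_double(x):
    """Return the double whose 8-byte representation is that of x reversed."""
    return struct.unpack("<d", struct.pack(">d", x))[0]

def fix_if_big_endian(values):
    """Byte-swap each value on big-endian hosts, leave them alone otherwise."""
    if sys.byteorder == "big":
        return [swap_double(v) for v in values]
    return list(values)
```

This would only hide the symptom in the test, so I would treat it as a
stopgap until the read path in `libOT.so` is understood.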

Thanks,
Pranav
