I have better luck with `inds = fill(:, 3)`.
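For context, a minimal sketch of what I mean, using a plain Julia array instead of an HDF5 dataset (the array `A` and the loop are just for illustration):

```Julia
A = rand(64, 64, 10)              # stand-in data; rank is only known at run time
inds = fill(:, ndims(A) - 1)      # a Vector of Colons, one per leading dimension
for i in 1:size(A, ndims(A))
    slice = A[inds..., i]         # splatting the colons selects A[:, :, i] here
    # ... write `slice` incrementally ...
end
```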
By the way, if anyone appropriate is watching, can we have a sticky post about how to format Julia code here? And is the comprehension form of a one-line "for" loop considered good style? I don't see it in the manual anywhere.

On Tuesday, September 13, 2016 at 9:36:58 PM UTC-4, sparrowhawker wrote:
>
> Cool! The colons approach makes sense to me, followed by splatting.
>
> I'm unfamiliar with the syntax here, but when I try to create a tuple in
> the REPL using
>
> inds = ((:) for i in 1:3)
>
> I get
>
> ERROR: syntax: missing separator in tuple
>
> On 13 September 2016 at 17:27, Erik Schnetter <[email protected]> wrote:
>
>> If you have a varying rank, then you should probably use something like
>> `CartesianIndex` and `CartesianRange` to represent the indices, or
>> possibly tuples of integers. You would then use the splatting operator to
>> create the indexing instructions:
>>
>> ```Julia
>> indrange = CartesianRange(xyz)
>> dset[indrange..., i] = slicedim
>> ```
>>
>> I don't know whether the expression `indrange...` works as-is, or whether
>> you have to manually create a tuple of `UnitRange`s.
>>
>> If you want to use colons, then you'd write
>>
>> ```Julia
>> inds = ((:) for i in 1:rank)
>> dset[inds..., i] = xyz
>> ```
>>
>> -erik
>>
>> On Tue, Sep 13, 2016 at 5:08 PM, Anandaroop Ray <[email protected]> wrote:
>>
>>> Many thanks for your comprehensive recommendations. I think HDF5 views
>>> are probably what I need to go with - will read up more and then ask.
>>>
>>> What I mean by dimension is rank, really. The shape is always the same
>>> for all samples. One slice for storage, i.e., one sample, could be
>>> chunked as dset[:,:,i] or dset[:,:,:,:,i], but always of the form
>>> dset[:, ..., :, i], depending on input to the code at run time.
>>>
>>> Thanks
>>>
>>> On 13 September 2016 at 14:47, Erik Schnetter <[email protected]> wrote:
>>>
>>>> On Tue, Sep 13, 2016 at 11:27 AM, sparrowhawker <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm new to Julia, and in the three months since I started using it I
>>>>> have been able to accomplish, in very little time, a lot of what I
>>>>> used to do in Matlab/Fortran. Here's my newest stumbling block.
>>>>>
>>>>> I have a process which creates nsamples within a loop. Each sample
>>>>> takes a long time to compute (say 1 to 10 seconds), as there are
>>>>> expensive finite-difference operations which ultimately lead to a
>>>>> sample. I have to store each of the nsamples, and I know the size and
>>>>> dimensions of each of the nsamples (all samples have the same size
>>>>> and dimensions). However, depending on the run-time parameters, each
>>>>> sample may be a 32x32 image or perhaps a 64x64x64 voxset with 3
>>>>> attributes, i.e., a 64x64x64x3 hyper-rectangle. To be clear, each
>>>>> sample can be a hyper-rectangle of arbitrary dimension, specified at
>>>>> run time.
>>>>>
>>>>> Obviously, since I don't want to lose computation and want to see
>>>>> incremental progress, I'd like to do incremental saves of these
>>>>> samples to disk, instead of waiting to collect all nsamples at the
>>>>> end. For instance, if I had to store 1000 samples of size 64x64, I
>>>>> thought perhaps I could chunk and save 64x64 slices to an HDF5 file
>>>>> 1000 times. Is this the right approach?
>>>>> If so, here's a prototype program to do so, but it depends on my
>>>>> knowing the number of dimensions of the slice, which is not known
>>>>> until run time:
>>>>>
>>>>> using HDF5
>>>>>
>>>>> filename = "test.h5"
>>>>> # open the file in write mode and get a file object
>>>>> fmode = "w"
>>>>> fid = h5open(filename, fmode)
>>>>> # matrix to write in chunks
>>>>> B = rand(64, 64, 1000)
>>>>> # figure out its dimensions
>>>>> sizeTuple = size(B)
>>>>> Ndims = length(sizeTuple)
>>>>> # set up to write in chunks of sizeArray; the last entry stays 1,
>>>>> # so each chunk is one slice along the last dimension
>>>>> sizeArray = ones(Int, Ndims)
>>>>> [sizeArray[i] = sizeTuple[i] for i in 1:(Ndims-1)]
>>>>> # create a dataset "models" within root
>>>>> dset = d_create(fid, "models", datatype(Float64), dataspace(size(B)),
>>>>>                 "chunk", sizeArray)
>>>>> [dset[:,:,i] = slicedim(B, Ndims, i) for i in 1:size(B, Ndims)]
>>>>> close(fid)
>>>>>
>>>>> This works, but the second-to-last line, dset[:,:,i] = ..., uses
>>>>> syntax specific to writing a slice of a three-dimensional array, and
>>>>> I don't know the dimensions until run time. Of course I could just
>>>>> write to a flat binary file incrementally, but HDF5.jl could make my
>>>>> life so much simpler!
>>>>
>>>> HDF5 supports "extensible datasets", which were created for use cases
>>>> such as this one. I don't recall the exact syntax, but if I recall
>>>> correctly, you can specify one dimension (the first one in C, the last
>>>> one in Julia) to be extensible, and then you can add more data as you
>>>> go. You will probably need to specify a chunk size, which could be the
>>>> size of the increment in your case. Given file system speeds, a chunk
>>>> size smaller than a few megabytes probably doesn't make much sense
>>>> (i.e., it will slow things down).
>>>>
>>>> If you want to monitor the HDF5 file as it is being written, look at
>>>> the SWMR feature. This requires HDF5 1.10; unfortunately, Julia will
>>>> by default often still install version 1.8.
>>>>
>>>> If you want to protect against crashes of your code so that you don't
>>>> lose progress, then HDF5 is probably not right for you. Once an HDF5
>>>> file is open for writing, the on-disk state might be inconsistent, so
>>>> you can lose all data when your code crashes. In this case, you might
>>>> want to write data into different files, one per increment. HDF5 1.10
>>>> offers "views", which are umbrella files that stitch together datasets
>>>> stored in other files.
>>>>
>>>> If you are looking for generic advice on setting things up with HDF5,
>>>> then I recommend their documentation. If you are looking for how to
>>>> access these features in Julia, or if you notice a feature that is not
>>>> available in Julia, then we'll be happy to explain or correct things.
>>>>
>>>> What do you mean by "dimension only known at run time" -- do you mean
>>>> what Julia calls "size" (shape) or what Julia calls "ndims" (rank)?
>>>>
>>>> Do all datasets have the same size, or do they differ? If they differ,
>>>> then putting them into the same dataset might not make sense; in this
>>>> case, I would write them into different datasets.
>>>>
>>>> -erik
>>>>
>>>> --
>>>> Erik Schnetter <[email protected]>
>>>> http://www.perimeterinstitute.ca/personal/eschnetter/
>>>
>>
>> --
>> Erik Schnetter <[email protected]>
>> http://www.perimeterinstitute.ca/personal/eschnetter/
>
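For the record, here is a sketch of how the quoted prototype could be made rank-agnostic with the `fill(:, N)` trick above. The `d_create` and `slicedim` calls are copied from the prototype (Julia 0.5-era HDF5.jl), and the filename and array sizes are just placeholders; treat it as untested:

```Julia
using HDF5

B = rand(64, 64, 1000)                     # stand-in data; any rank should work
fid = h5open("test.h5", "w")
Ndims = ndims(B)
# chunk by one slice along the last dimension
chunkSize = [size(B)[1:Ndims-1]..., 1]
dset = d_create(fid, "models", datatype(Float64), dataspace(size(B)),
                "chunk", chunkSize)
inds = fill(:, Ndims - 1)                  # one Colon per leading dimension
for i in 1:size(B, Ndims)
    dset[inds..., i] = slicedim(B, Ndims, i)   # same as dset[:,:,i] = ... for rank 3
end
close(fid)
```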
