On Wed, 27 Nov 2019, Leonidas Liakos via R-sig-Geo wrote:

Thank you for your help!

I tried to fix stackApply according to your instructions.

Now the indices of names are the same and consistent with indices
enumeration (gist for validation and tests:
https://gist.github.com/kokkytos/93f315a5ecf59c0b183f9788754bc170).

I've attached a patch file here:

https://gist.github.com/kokkytos/ca2c319134677b19900579665267a7a7

Thanks very much for contributing!

Please consider raising an issue on https://github.com/rspatial/raster pointing to this thread and your patch. I had expected response from raster developers here, but they may well be on field work, so raising an issue on the development site should get their attention when there is enough time for them to look. You might even fork raster, apply your patch and file a PR in addition to the issue. In that case, a short test would be helpful, and maybe edits to the documentation.

Roger



stackapply_mean
class      : RasterBrick
dimensions : 300, 300, 90000, 7  (nrow, ncol, ncell, nlayers)
resolution : 500, 500  (x, y)
extent     : 0, 150000, 0, 150000  (xmin, xmax, ymin, ymax)
crs        : NA
source     : /tmp/Rtmp9W8UNc/raster/r_tmp_2019-11-27_191205_2929_20324.grd
names      :  index_5,  index_6,  index_7,  index_1,  index_2, 
index_3,  index_4
min values : 444.6946, 440.2028, 429.6900, 442.7436, 440.0467, 444.9182,
437.1589
max values : 565.8671, 560.1375, 561.7972, 556.2471, 563.8341, 561.7687,
560.4509

ver_mean
class      : RasterStack
dimensions : 300, 300, 90000, 7  (nrow, ncol, ncell, nlayers)
resolution : 500, 500  (x, y)
extent     : 0, 150000, 0, 150000  (xmin, xmax, ymin, ymax)
crs        : NA
names      :  layer.1,  layer.2,  layer.3,  layer.4,  layer.5, 
layer.6,  layer.7
min values : 442.7436, 440.0467, 444.9182, 437.1589, 444.6946, 440.2028,
429.6900
max values : 556.2471, 563.8341, 561.7687, 560.4509, 565.8671, 560.1375,
561.7972





On 11/26/19 10:58 PM, Vijay Lulla wrote:
Hmm...it appears that stackApply is using different conditions which
might be causing this problem. Below is the snippet of the code which
I think might be the problem.

## For canProcessInMemory
if (rowcalc) {
  v <- lapply(uin, function(i) fun(x[, ind == i, drop = FALSE], na.rm
= na.rm))
}
else {
  v <- lapply(uin, function(i, ...) apply(x[, ind == i, drop = FALSE],
1, fun, na.rm = na.rm))
}


## If canProcessInMemory is not TRUE
if (rowcalc) {
  v <- lapply(uin, function(i) fun(a[, ind == uin[i], drop = FALSE],
na.rm = na.rm))
}
else {
  v <- lapply(uin, function(i, ...) apply(a[, ind == uin[i], drop =
FALSE], 1, fun, na.rm = na.rm))
}

I think they should both be same but it appears that they aren't and
that's what you've discovered.  Maybe you can try fix(stackApply) to
see if it really is the problem and can tell us what you find. 
Anyways, good catch...and...sorry for wasting your time.
Cordially,
Vijay.

On Tue, Nov 26, 2019 at 2:53 PM Leonidas Liakos
<leonidas_lia...@yahoo.gr <mailto:leonidas_lia...@yahoo.gr>> wrote:

    Thank you!
    The problem is not with the resulting values but with the index
    mapping. Values are correct in all three cases.

    As I wrote in a previous post in the thread
    (https://stat.ethz.ch/pipermail/r-sig-geo/2019-November/027821.html)
    , stackApply behaves inconsistently depending on whether the
    exported stack will remain in memory or it will be stored, due to
    its large size, on the hard disk.

    In the first case the indices are identical to my function
    (ver_mean) and the lubridate::wday indexing system (and are
    correct) while in the second they are shuffled.

    So, while Sunday has index 1 and while in the first case (when the
    result is in memory) stackApply returns the correct index, in the
    second case (when the result is stored on the hard disk) it
    returns index_4! So how can one be sure if index_1 corresponds to
    Sunday or another day using stackApply since it sometimes
    enumerates it with index_1 and sometimes index_4?


    Try to run this example (when the resulting stack remains in
    memory) to see that the indexes are identical (stackApply =
    ver_median = lubridate::wday)
    https://gist.github.com/kokkytos/5d554b5a725bb48d2189e2d1fa0e2206

    Thank you again

    On 11/26/19 9:00 PM, Vijay Lulla wrote:
    I'm sorry for the miscommunication.  What I meant to say is that
    the output from stackApply and zApply are the same (because
    zApply uses stackApply internally) except the names.  The
    behavior of stackApply makes sense because AFAIUI R doesn't
    automatically reorder vectors/indices that it receives.  Your
    observation about inconsistent result with ver_mean is very valid
    though!  And, that's what I meant with my comment that using
    sapply with the explicit ordering that you want is the best way
    to control what raster package will output.  In R the input order
    should be maintained (this is the prime difference between SQL
    and R) but packages/tools do not always adhere to this...so it's
    never clear how the output will be ordered.  Sorry for the confusion.


    On Tue, Nov 26, 2019 at 12:22 PM Leonidas Liakos
    <leonidas_lia...@yahoo.gr <mailto:leonidas_lia...@yahoo.gr>> wrote:

        Why do they seem logical since they do not match?

        Check for example index 1 (Sunday). The results are different
        for the three processes

       > stackapply_mean
        class      : RasterBrick
        dimensions : 300, 300, 90000, 7  (nrow, ncol, ncell, nlayers)
        resolution : 500, 500  (x, y)
        extent     : 0, 150000, 0, 150000  (xmin, xmax, ymin, ymax)
        crs        : NA
        source     :
        /tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191359_7710_20324.grd
        names      :  index_5,  index_6,  index_7,  index_1, 
        index_2,  index_3,  index_4
        min values : 440.0467, 444.9182, 437.1589, 444.6946,
        440.2028, 429.6900, 442.7436
        max values : 563.8341, 561.7687, 560.4509, 565.8671,
        560.1375, 561.7972, 556.2471


       > ver_mean
        class      : RasterStack
        dimensions : 300, 300, 90000, 7  (nrow, ncol, ncell, nlayers)
        resolution : 500, 500  (x, y)
        extent     : 0, 150000, 0, 150000  (xmin, xmax, ymin, ymax)
        crs        : NA
        names      :  layer.1,  layer.2,  layer.3,  layer.4, 
        layer.5,  layer.6,  layer.7
        min values : 442.7436, 440.0467, 444.9182, 437.1589,
        444.6946, 440.2028, 429.6900
        max values : 556.2471, 563.8341, 561.7687, 560.4509,
        565.8671, 560.1375, 561.7972


       > z
        class      : RasterBrick
        dimensions : 300, 300, 90000, 7  (nrow, ncol, ncell, nlayers)
        resolution : 500, 500  (x, y)
        extent     : 0, 150000, 0, 150000  (xmin, xmax, ymin, ymax)
        crs        : NA
        source     :
        /tmp/RtmpkRMXLb/raster/r_tmp_2019-11-26_191439_7710_04780.grd
        names      :       X1,       X2,       X3,       X4,      
        X5,       X6,       X7
        min values : 440.0467, 444.9182, 437.1589, 444.6946,
        440.2028, 429.6900, 442.7436
        max values : 563.8341, 561.7687, 560.4509, 565.8671,
        560.1375, 561.7972, 556.2471
                   : 1, 2, 3, 4, 5, 6, 7


        On 11/26/19 7:03 PM, Vijay Lulla wrote:
        If you read the code/help for `stackApply` and `zApply`
        you'll see that the results that you obtain make sense (at
        least they seem sensible/reasonable to me).  IMO, if you
        want to control the ordering of your layers then just use
        sapply, like how you've used for ver_mean.  IMO, this is the
        only reliable (safe?), and quite a readable, way to
        accomplish what you're trying to do.
        Just my 2 cents.
        -- Vijay.

        On Tue, Nov 26, 2019 at 11:19 AM Leonidas Liakos via
        R-sig-Geo <r-sig-geo@r-project.org
        <mailto:r-sig-geo@r-project.org>> wrote:

            I added raster::zApply in my tests to validate the
            results. However, the
            indices of the names of the results are different now.
            Recall that the
            goal is to calculate from a raster stack time series the
            mean per day of
            the week. And that problem I have is that stackApply,
            zApply and
            calc/sapply return different indices in the result
            names. New code is
            available here:
            https://gist.github.com/kokkytos/93f315a5ecf59c0b183f9788754bc170
            I'm really curious about missing something.


            On 11/20/19 3:30 AM, Frederico Faleiro wrote:
           > Hi Leonidas,
           >
           > both results are in the same order, but the name is
            different.
           > You can rename the first as in the second:
           > names(res) <- names(res2)
           >
           > I provided an example to help you understand the logic.
           >
           > library(raster)
           > beginCluster(2)
           > r <- raster()
           > values(r) <- 1
           > # simple sequential stack from 1 to 6 in all cells
           > s <- stack(r, r*2, r*3, r*4, r*5, r*6)
           > s
           > res <- clusterR(s, stackApply, args =
            list(indices=c(2,2,3,3,1,1), fun
           > = mean))
           > res
           > res2 <- stackApply(s, c(2,2,3,3,1,1), mean)
           > res2
           > dif <- res - res2
           > # exatly the same order because the difference is zero
            for all layers
           > dif
           > # rename
           > names(res) <- names(res2)
           >
           > Best regards,
           >
           > Frederico Faleiro
           >
           > On Tue, Nov 19, 2019 at 4:15 PM Leonidas Liakos via
            R-sig-Geo
           > <r-sig-geo@r-project.org
            <mailto:r-sig-geo@r-project.org>
            <mailto:r-sig-geo@r-project.org
            <mailto:r-sig-geo@r-project.org>>> wrote:
           >
           >     I run the example with clusterR:
           >
           >     no_cores <- parallel::detectCores() -1
           >     raster::beginCluster(no_cores)
           >     ?????? res <- raster::clusterR(inp,
            raster::stackApply, args =
           >     list(indices=c(2,2,3,3,1,1),fun = mean))
           >     raster::endCluster()
           >
           >     And the result is:
           >
           >     > res
           >     class?????????? : RasterBrick
           >     dimensions : 180, 360, 64800, 3?? (nrow, ncol,
            ncell, nlayers)
           >     resolution : 1, 1?? (x, y)
           >     extent???????? : -180, 180, -90, 90?? (xmin, xmax,
            ymin, ymax)
           >     crs?????????????? : +proj=longlat +datum=WGS84
            +ellps=WGS84
           >     +towgs84=0,0,0
           >     source???????? : memory
           >     names?????????? : layer.1, layer.2, layer.3
           >     min values :???????? 1.5,???????? 3.5,???????? 5.5
           >     max values :???????? 1.5,???????? 3.5,???????? 5.5??
           >
           >
           >     layer.1, layer.2, layer.3 (?)
           >
           >     So what corrensponds to what?
           >
           >
           >     If I run:
           >
           >     res2 <- stackApply(inp,c(2,2,3,3,1,1),mean)
           >
           >     The result is:
           >
           >     > res2
           >     class      : RasterBrick
           >     dimensions : 180, 360, 64800, 3  (nrow, ncol,
            ncell, nlayers)
           >     resolution : 1, 1  (x, y)
           >     extent     : -180, 180, -90, 90  (xmin, xmax,
            ymin, ymax)
           >     crs        : +proj=longlat +datum=WGS84
            +ellps=WGS84 +towgs84=0,0,0
           >     source     : memory
           >     names      : index_2, index_3, index_1
           >     min values :     1.5,     3.5,     5.5
           >     max values :     1.5,     3.5,     5.5
           >
           >     There is no consistency with the names of the
            output and obscure
           >     correspondence with the indices in the case of
            clusterR
           >
           >
           >             [[alternative HTML version deleted]]
           >
           >     _______________________________________________
           >     R-sig-Geo mailing list
           >     R-sig-Geo@r-project.org
            <mailto:R-sig-Geo@r-project.org>
            <mailto:R-sig-Geo@r-project.org
            <mailto:R-sig-Geo@r-project.org>>
           >     https://stat.ethz.ch/mailman/listinfo/r-sig-geo
           >

                    [[alternative HTML version deleted]]

            _______________________________________________
            R-sig-Geo mailing list
            R-sig-Geo@r-project.org <mailto:R-sig-Geo@r-project.org>
            https://stat.ethz.ch/mailman/listinfo/r-sig-geo






        [[alternative HTML version deleted]]

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


--
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: roger.biv...@nhh.no
https://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Reply via email to