Alessandro,

If you are somewhat inexperienced with C++, I suggest reading Effective C++ by 
Scott Meyers. It's easy to get lost in some of his explanations as they are 
very detailed, but you can just follow his advice, and come back to them later.

Dale Smith, Ph.D.
Senior Financial Quantitative Analyst
Financial & Risk Management Solutions
Fiserv
Office: 678-375-5315
www.fiserv.com

-----Original Message-----
From: rcpp-devel-boun...@r-forge.wu-wien.ac.at 
[mailto:rcpp-devel-boun...@r-forge.wu-wien.ac.at] On Behalf Of Kevin Ushey
Sent: Sunday, November 24, 2013 2:32 PM
To: Alessandro Mammana
Cc: rcpp-devel@lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] Some beginner questions

Hi Ale,

My guess: the elements are not being initialized in the order you expect.

In fact, class members in C++ are initialized _in the order they are declared 
in the class_, not the order you place them in the initializer list. So, based 
on that, your code tries to first initialize run, then rlen, but rlen depends 
on rlens[0] which has not yet been initialized, and so things go wrong.

If you turn on compiler warnings (-Wall) you get informative warnings.
In fact, clang points right at the problem for me ;)

test.cpp:16:3: warning: field 'names' will be initialized after field 'rlen' 
[-Wreorder]
  names(as<std::vector<std::string> >(values.attr("levels"))),
  ^
test.cpp:17:3: warning: field 'rlen' will be initialized after field 'run' 
[-Wreorder]
  rlen(rlens[0]), // <--- THIS CAUSES SEGFAULT!!!!
  ^

(sidenote: I highly recommend creating a file '~/.R/Makevars', and inserting 
the line:

    CFLAGS=-g -O2 -Wall -pedantic
    CXXFLAGS=-g -O2 -Wall -pedantic

so that your compiler picks out these code smells for you whenever compiling 
C/C++ code with R)

As for your other questions re: copying: RObjects are merely thin wrappers over 
pointers, so copying an RObject does not involve copying all the memory 
encompassing an R object, just the pointer to that object. Rcpp containers will 
always wrap to the R object if the R type matches the container type -- e.g., 
IntegerVectors wrap around R's integer vectors, but force a copy / coercion 
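when you have a numeric (double) R vector. Make sure the type of object you
think you're passing from R matches the container you're using in Rcpp -- check
what typeof(rle@lengths) gives you (mode() reports "numeric" for both integer
and double vectors, so it cannot tell them apart).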

All of Rcpp's containers are very light, so I doubt you gain much by, e.g.,
passing an Rcpp::IntegerVector by reference rather than by value.
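
As a rough analogy for what "thin wrapper over a pointer" means, here is a small standalone sketch. It is NOT Rcpp's actual implementation (Rcpp holds a pointer to an R object that it protects from garbage collection, not a std::shared_ptr); IntHandle, sharesStorage and setFirst are hypothetical names used only to show that copying a handle copies the pointer, not the data.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical handle type: an analogy only, not how Rcpp is implemented.
class IntHandle {
    std::shared_ptr<std::vector<int>> data;   // pointer to shared storage
public:
    explicit IntHandle(std::vector<int> v)
        : data(std::make_shared<std::vector<int>>(std::move(v))) {}

    int& operator[](std::size_t i) { return (*data)[i]; }

    // Address of the underlying buffer, to check who shares what.
    const int* address() const { return data->data(); }
};

// 'copy' is passed by value: only the handle (the pointer) is copied,
// so both handles still refer to the same buffer.
bool sharesStorage(IntHandle copy, const IntHandle& original) {
    return copy.address() == original.address();
}

// Writing through a by-value copy is visible through the original handle.
void setFirst(IntHandle h, int value) { h[0] = value; }
```

The practical consequence is the same as with Rcpp's vectors: passing by value is cheap, but it does not protect the caller's data from modification.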

-Kevin

On Sun, Nov 24, 2013 at 10:08 AM, Alessandro Mammana <mamm...@molgen.mpg.de> 
wrote:
> Dear all,
> I had some problems figuring out how to write some code for iterating 
> through the values of a run-length-encoded factor (Rle). Now I kind of 
> made it work, but I am not sure that the code does exactly what I 
> expect. My questions are both about Rcpp and about C++; tell me if 
> this is not the right place to ask them.
>
> The function I am writing should iterate through an object of formal 
> class 'Rle' (from the "IRanges" package), which looks like this:
> 1. It has two slots: 'values' and 'lengths'. They have the same 
> length; values is a factor and lengths is an integer vector.
> 2. values is a factor: an integer vector with an associated character 
> vector (attribute "levels"), and the integer vector points to elements 
> in the character vector.
>
> For instance, the factor f = factor(c('a','a','a','a','b','c','c')),
> when it is run-length-encoded with rle = Rle(f), looks like this:
> rle@values ~ c(1, 2, 3)
> attributes(rle@values)$levels ~ c("a","b","c")
> rle@lengths ~ c(4,1,2)
>
> To make things a bit more complicated, in my situation this Rle object 
> is contained in a GRanges object 'gr': rle = gr@seqnames
>
> I wanted to write the code for a class that encapsulates the iteration 
> through such an object (maybe that's a bit java-style). And that was 
> my first version that compiled:
>
> class rleIter {
>     int run;
>     int rlen;
>     int rpos;
> //should I declare them references if I don't want any unnecessary copying?
>     IntegerVector rlens;
>     IntegerVector values;
>     std::vector<std::string> names;
>     public:
>         rleIter(RObject& rle):
>             rlens(as<IntegerVector>(rle.slot("lengths"))), // is the vector copied here?
>             values(as<IntegerVector>(rle.slot("values"))),
>             names(as<std::vector<std::string> >(values.attr("levels"))),
>             rlen(rlens[0]), // <--- THIS CAUSES SEGFAULT!!!!
>             run(0), rpos(0)
>         {}
>
>         bool next(){
>             ++rpos;
>             if (rpos == rlens[run]){ //end of the run, go to the next
>                 ++run; rpos = 0;
>                 if (run == rlens.length())
>                     return false;
>             }
>             return true;
>         }
>
>         const std::string& getValue(){
>             return names[values[run]-1];
>         }
>
> };
>
>
> void readRle(RObject gr){ //passed in by value (it was a mistake)
>     RObject rle = as<RObject>(gr.slot("seqnames")); // <- is this vector copied here?
>     rleIter iter(rle);
>     bool finished = false;
>     for (; !finished; finished = !iter.next()){
>         Rcout << iter.getValue() << std::endl;
>     }
> }
>
> // [[Rcpp::export]]
> void test(RObject gr){
>     readRle(gr);
> }
>
> in R:
>
> library(GenomicRanges)
> gr <- GRanges(seqnames=c("chr1", "chr1","chr2"),
> ranges=IRanges(start=c(1,10,7),end=c(10,101,74)))
> library(my_package_under_development_with_the_rcpp_code_shown_above)
> test(gr)
>
> SEGFAULT
>
> Questions:
>
> 1. This code gives segfault at the point that I indicated. Why? Maybe 
> I am pointing within the initializer list to areas of memory that are 
> allocated and filled in in the initializer list and maybe this is 
> forbidden?
> 2. If I change the signature of the function readRle and I pass the gr 
> object by reference, the segfault disappears. Why? If I copy the gr 
> object the copy should be identical, so why do they behave 
> differently?
> 3. I don't understand if doing:
> RObject rle = as<RObject>(gr.slot("seqnames")); causes the vector rle 
> to be copied, and, what is worse, I have no idea about what resources 
> to look up to find it out, or what reasoning/principles to think 
> about, other than posting in this mailing list or attempting to look 
> at the source code for hours...
> 4. If I replace the line above with:
> RObject& rle = as<RObject>(gr.slot("seqnames")); so that I am sure 
> that the vector is not copied, the compiler complains saying that
> as<RObject>(gr.slot("seqnames")) is an rvalue, and if I want to 
> reference it, the reference should be constant. How do I create a 
> non-constant reference to a slot of an S4 object then?
>
> If you made it through the end of this very long and boring email and 
> if you could give me some help I would be extremely grateful.
>
> Ale
>
> --
> Alessandro Mammana, PhD Student
> Max Planck Institute for Molecular Genetics
> Ihnestraße 63-73
> D-14195 Berlin, Germany
> _______________________________________________
> Rcpp-devel mailing list
> Rcpp-devel@lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
