[Rd] .Call ref card [was Re: R-devel Digest, Vol 109, Issue 22]

2012-03-22 Thread Ramon Diaz-Uriarte



On Thu, 22 Mar 2012 10:38:55 -0400, Simon Urbanek simon.urba...@r-project.org 
wrote:

 On Mar 22, 2012, at 9:45 AM, Terry Therneau thern...@mayo.edu wrote:

  
  
  
   strongly disagree. I'm appalled to see that sentence here.
   
   Come on!
   
   The overhead is significant for any large vector and it is in 
   particular unnecessary since in .C you have to allocate *and copy* 
   space even for results (twice!). Also it is very error-prone, because 
   you have no information about the length of vectors so it's easy to 
   run out of bounds and there is no way to check. IMHO .C should not be 
   used for any code written in this century (the only exception may be 
   if you are passing no data, e.g. if all you do is to pass a flag and 
   expect no result, you can get away with it even if it is more 
   dangerous). It is a legacy interface that dates way back and is 
   essentially just a renamed .Fortran interface. Again, I would strongly 
   recommend the use of .Call in any recent code because it is safer and 
   more efficient (if you don't care about either attribute, well, feel 
   free ;)).
   
   So aleph will not support the .C interface? ;-)
   
  It will look at the timestamp of the source file and delete the package if 
  it is not before 1980 ;). Otherwise it will send a request for punch cards 
  with ".C is deprecated, please upgrade to .Call" stamped out :P At that 
  point I'll be flaming about using the native Aleph interface and not the R 
  compatibility layer ;)
  
  Cheers,
  S
  I'll dissent -- I don't think .C is inherently any more dangerous than 
  .Call and prefer its simplicity in many cases.  Calling C at all is what 
  is inherently dangerous -- I can reference beyond the end of a vector, 
  write over objects that should be read only, and branch to random places 
  using either interface. 

 You can always do so deliberately, but with .C you have no way of preventing 
 it since you don't even know what the length is! That is certainly far more 
 dangerous than .Call, where you can simply loop over the length, check that 
 the lengths are compatible, etc. Also, for types like strings .C is a minefield 
 that is hard not to blow up, whereas with .Call it is even safer than scalar 
 arrays. You can do none of that with .C, which relies entirely on conventions 
 with no recorded semantics.
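
The point about lengths can be sketched in plain C. This is only an analogy under stated assumptions: `dvec`, `scale_raw`, and `scale_checked` are hypothetical names invented for illustration and are not part of the R API. `scale_raw` mimics the .C convention, where every argument arrives as a bare pointer and the length is just another value the caller promises is right. `scale_checked` mimics the idea behind .Call, where each vector carries its own length so the callee can refuse incompatible arguments.

```c
#include <stddef.h>

/* Hypothetical length-carrying vector. It stands in for the fact that an
   R vector object knows its own length; it is NOT the R API. */
typedef struct {
    size_t  length;
    double *data;
} dvec;

/* .C-style: bare pointers only. The callee must trust *n; it has no way
   to verify that x and out really hold *n elements. */
void scale_raw(const double *x, const int *n, const double *factor,
               double *out)
{
    for (int i = 0; i < *n; i++)
        out[i] = x[i] * (*factor);
}

/* .Call-style in spirit: the lengths travel with the data, so a mismatch
   can be detected before any element is touched.
   Returns 0 on success, -1 if the arguments are incompatible. */
int scale_checked(const dvec *x, double factor, dvec *out)
{
    if (x == NULL || out == NULL || x->length != out->length)
        return -1;
    for (size_t i = 0; i < x->length; i++)
        out->data[i] = x->data[i] * factor;
    return 0;
}
```

Calling `scale_raw` with a wrong `*n` silently reads or writes out of bounds, whereas `scale_checked` returns -1 and leaves the output untouched when the two lengths differ.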


  If you are dealing with large objects and worry about memory efficiency 
  then .Call puts more tools at your disposal and is worth the effort.  
  However, I did not find the .Call interface at all easy to use at first

 I guess this depends on the developer and is certainly a
 factor. Personally, I find the subset of the R API needed for .Call
 fairly small and intuitive (in particular when you are just writing a
 safer replacement for .C), but I'm obviously biased. Maybe in a separate
 thread we could discuss this - I'd be happy to write a ref card or cheat
 sheet if I find out what people find challenging on .Call. Nonetheless,
 my point is that it is more than worth investing the effort both in
 safety and performance.


After your previous email I made a mental note to try to finally learn to
use .Call since I often deal with large objects. So, yes, I'd love to see
a ref card and cheat sheet: I have tried learning to use .Call a few
times, but have always gone back to .C since (it seems that) all I needed
to know are just a couple of conventions, and the rest is C as usual.



You say "if I find out what people find challenging on
.Call". Hummm... can I answer "basically everything"?  I think, as Terry
Therneau says, "the things I needed to know are scattered about in
multiple places". When I see the convolve example (5.2 in Writing R
Extensions) I understand the C code; when I see the convolve2 example in
5.10.1 I think I can guess what lines "PROTECT(a ..." to "xab =
NUMERIC_POINTER ..." might be doing, but I would not know how to do that
on my own. Yes, I can go to 5.9.1 to read about PROTECT, then search for
... But, at that point, I've gone back to .C. Of course, this might just
be my laziness/incompetence/whatever.

 

Best,

R.

  and we should keep that in mind before getting too pompous in our lectures 
  to the sinners of .C.  (Mostly because the things I needed to know are 
  scattered about in multiple places.)
  
  I might have to ask for an exemption on that timestamp -- the first bits of 
  the survival package only reach back to 1986.  And I've had to change 
  source code systems multiple times which plays hob with the file times, 
  though I did try to preserve the changelog history to forestall some future 
  litigious soul who claims they wrote it first  (sccs -> rcs -> cvs -> svn 
  -> mercurial).   :-)
  

 ;) Maybe the rule should be based on the date of the first appearance of the 
 package, fair enough :)

 Cheers,
 Simon

 __
 R-devel@r-project.org mailing list
 

Re: [Rd] .Call ref card [was Re: R-devel Digest, Vol 109, Issue 22]

2012-03-22 Thread peter dalgaard
Don't know how useful it is any more, but back in the day, I gave this talk in 
Vienna

http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Dalgaard.pdf

Looking at it now, perhaps it moves a little too quickly into the hairy stuff. 
On the other hand, those were the things that I had found important to figure 
out at the time. At a quick glance, I didn't spot anything obviously outdated. 

