On 06/26/2014 05:18 PM, Sandip Nandi wrote:
Hi,
I asked whether the data structure I am using to create a dataframe is
fine, or whether there is any other way I can use. My aim is to read a
database, write it to a dataframe, and do operations on it. The
dataframe creation and output all work. The code I posted is wrong
because I pieced it together from fragments, sorry for that. I feel my
way of doing it, creating a 2D array, may not be the best, so it would
be great if someone could point out any drawback of my method. My code
in production can read 100k rows and write them in 15 seconds. But in
one case, when I try to assign NA_REAL to a real vector, it causes a
floating point exception. So I suspect something is wrong. People may be
doing this in a faster, more efficient way.
Please understand that the code you send is useful for the discussion
only if we can understand it. And for this it needs to make sense.
The code below still makes little sense. Did you try it? For example
you're calling SET_VECTOR_ELT() and setAttrib() on an SEXP ('df') that
you didn't even allocate. Sounds maybe like a detail but because of
that the code will segfault and, more importantly, it's not clear what
kind of SEXP you want 'df' to be.
Also the following line makes no sense:
setAttrib(df,R_RowNamesSymbol,lsnm);
given that 'lsnm' is c("int", "string") so it looks more like the col
names than the row names (and also because you're apparently trying to
make a 3x2 data.frame, not a 2x2).
Anyway, once you realize that a data.frame is just a list with 3
attributes:
> attributes(data.frame(int=c(99,89,12), string=c("aa", "vv", "gy")))
$names
[1] "int" "string"
$row.names
[1] 1 2 3
$class
[1] "data.frame"
everything becomes simple at the C level i.e. just make that list
and stick these 3 attributes on it. You don't need to call R code
from C (which BTW will protect you from random changes in the behavior
of the data.frame() constructor). You don't need the intermediate
'valueVector' data structure (what you seem to be referring to as the
"2D array of SEXP", don't know why, doesn't look like a 2D array to me,
but you never explained).
Cheers,
H.
This is a sample code
/**
 * dfm is a dataframe, which I assume is a list of lists. So I created a SEXP
 * array valueVector[2] where each element can hold a different datatype. Now
 * values are assigned and the dataframe is generated at the end.
 **/
SEXP formDF() {
    SEXP dfm, head, df, dfint, dfStr, lsnm;
    SEXP valueVector[2];
    char *ab[3] = {"aa","vv","gy"};
    int sn[3] = {99,89,12};
    char *listnames[2] = {"int","string"};
    int i, j;
    PROTECT(valueVector[0] = allocVector(REALSXP,3));
    PROTECT(valueVector[1] = allocVector(STRSXP,3));
    PROTECT(lsnm = allocVector(STRSXP,2));
    SET_STRING_ELT(lsnm,0,mkChar("int"));
    SET_STRING_ELT(lsnm,1,mkChar("string"));
    for (i = 0; i < 3; i++) {
        SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));
        REAL(valueVector[0])[i] = sn[i];
    }
    SET_VECTOR_ELT(df,1,valueVector[0]);
    SET_VECTOR_ELT(df,0,valueVector[1]);
    setAttrib(df,R_RowNamesSymbol,lsnm);
    PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
    SET_TAG(CDDR(dfm), install("stringsAsFactors"));
    SEXP res = PROTECT(eval(dfm,R_GlobalEnv));
    UNPROTECT(7);
    return res;
}
On Thu, Jun 26, 2014 at 4:52 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
Hi Sandip,
On 06/26/2014 04:21 PM, Sandip Nandi wrote:
Hi ,
I have put incomplete code here. The complete code works. My
question is: is what I am doing logical/safe? Is any memory leak going
to happen? Is there any other way to create a dataframe?
I still don't believe it "works". It doesn't even compile. More below...
SEXP formDF() {
    SEXP dfm, head, df, dfint, dfStr, lsnm;
    SEXP valueVector[2];
    char *ab[3] = {"aa","vv","gy"};
    int sn[3] = {99,89,12};
    char *listnames[2] = {"int","string"};
    int i, j;
    PROTECT(df = allocVector(VECSXP,2));
    PROTECT(valueVector[0] = allocVector(REALSXP,3));
    PROTECT(valueVector[1] = allocVector(VECSXP,3));
    PROTECT(lsnm = allocVector(STRSXP,2));
    SET_STRING_ELT(lsnm,0,mkChar("int"));
    SET_STRING_ELT(lsnm,1,mkChar("string"));
    SEXP rawvec, headr;
    for (i = 0; i < 3; i++) {
        SET_STRING_ELT(valueVector[1],0,mkChar(listNames[i]));
'listNames' is undeclared (C is case-sensitive).
Even assuming you managed to compile this with an (imaginary)
case-insensitive C compiler, 'listnames' is an array of length
2 and this for loop tries to read the first 3 elements
from it. So you're just lucky that you didn't get a segfault.
In any case, I don't see how this code could produce
the data.frame you're trying to make.
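One way to avoid the out-of-bounds read described above is to derive the loop bound from the array itself rather than hard-coding 3. A minimal sketch in plain C; `ARRAY_LEN` and `count_nonempty` are hypothetical names used only for illustration, not part of the R API:

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static const char *listnames[] = {"int", "string"};

/* count_nonempty() is a hypothetical helper: it visits every element of
 * 'listnames' exactly once and never reads past the end. */
size_t count_nonempty(void)
{
    size_t i, n = 0;
    for (i = 0; i < ARRAY_LEN(listnames); i++)
        if (listnames[i][0] != '\0')
            n++;
    return n;
}
```

With the bound tied to the array, adding or removing a column name cannot leave the loop reading stale or invalid memory.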
If you want to discuss how to improve code that *works* (i.e.
compiles and produces the expected result), that's fine, but you
should be able to show that code. Otherwise it sounds like you're
asking people to fix your code. Or to write it for you. Maybe
that's fine too but people will be more sympathetic and willing
to help if you're honest about it.
Cheers,
H.
        REAL(valueVector[0])[i] = sn[i];
    }
    SET_VECTOR_ELT(df,1,valueVector[0]);
    SET_VECTOR_ELT(df,0,valueVector[1]);
    setAttrib(df,R_RowNamesSymbol,lsnm);
    PROTECT(dfm=lang3(install("data.frame"),df,ScalarLogical(FALSE)));
    SET_TAG(CDDR(dfm), install("stringsAsFactors"));
    SEXP res = PROTECT(eval(dfm,R_GlobalEnv));
    UNPROTECT(7);
    return res;
}
On Thu, Jun 26, 2014 at 3:49 PM, Hervé Pagès <hpa...@fhcrc.org> wrote:
Hi,
On 06/26/2014 02:32 PM, Sandip Nandi wrote:
Hi ,
For our production package I need to create a dataframe in C, so I
wrote the following code:
SEXP dfm, head, df, dfint, dfStr, lsnm;
SEXP valueVector[2];
char *ab[3] = {"aa","vv","gy"};
int sn[3] = {99,89,12};
char *listnames[2] = {"int","string"};
int i, j;
//=============================
PROTECT(df = allocVector(VECSXP,2));
PROTECT(valueVector[0] = allocVector(REALSXP,3));
PROTECT(valueVector[1] = allocVector(VECSXP,3));
PROTECT(lsnm = allocVector(STRSXP,2));
SET_STRING_ELT(lsnm,0,mkChar("int"));
SET_STRING_ELT(lsnm,1,mkChar("string"));
SEXP rawvec, headr;
unsigned char str[24]="abcdef";
for (i = 0; i < 3; i++) {
    SET_STRING_ELT(valueVector[1],i,mkChar(ab[i]));
    REAL(valueVector[0])[i] = sn[i];
}
It works: the data frame is being created and executed properly.
Really? You mean, you can compile this code, right? Otherwise it's
incomplete: you allocate but do nothing with 'df'. Same with 'lsnm'.
And you don't UNPROTECT. With no further treatment, 'df' will be an
unnamed list containing junk data, but not the data.frame you expect.
So there are a few gaps that would need to be filled before this code
actually works as intended.
Maybe try and come back again with specific questions?
Cheers,
H.
Just curious whether I am doing anything wrong, or whether there is
another way to create a data frame. I am concerned about the SEXP 2D
array.
Thanks,
Sandip
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319