Your code works!
strangelines.txt was created, and it's a text file containing nothing but spaces ...
Seems like a few thousand completely blank lines (not one non-blank entry).
One thing: when I ran your code there was an error message:
setwd("C:/Users/admin/Desktop/hons/Thesis")
con <- file("dataset.txt",
Dear Experienced R Practitioners,
I have a 4GB .txt file called dataset.txt and have attempted to use the ff,
bigmemory, filehash and sqldf packages to import it, but have had no
success. The readLines output of this data is:
readLines("dataset.txt", n=20)
[1]
Jan, thank you.
table(line_sizes)
line_sizes
       0        1       97      256
    1430     2860 46869069     1430
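A table like this can be built without ever holding the whole file in memory. The following is only a sketch under assumptions: the temp file stands in for dataset.txt, and `count_line_sizes` and `chunk_size` are hypothetical names, not the code from the thread. It reads through an open connection in chunks and keeps a running count per line length.

```r
# Hypothetical stand-in for dataset.txt: a tiny temp file with some blank lines
path <- tempfile(fileext = ".txt")
writeLines(c("", " ", "a\tb\tc", "", "d\te\tf"), path)

# Count how many lines have each length, reading chunk_size lines at a time
count_line_sizes <- function(path, chunk_size = 2L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  counts <- integer(0)  # running totals, named by line length
  repeat {
    chunk <- readLines(con, n = chunk_size, warn = FALSE)
    if (length(chunk) == 0L) break
    # type = "bytes" avoids errors on lines with invalid multibyte text
    tab <- table(nchar(chunk, type = "bytes"))
    for (k in names(tab)) {
      previous <- if (is.na(counts[k])) 0L else counts[k]
      counts[k] <- previous + tab[[k]]
    }
  }
  # Return the counts ordered by line length
  counts[order(as.integer(names(counts)))]
}

line_size_counts <- count_line_sizes(path)
print(line_size_counts)
```

On a multi-gigabyte file a much larger `chunk_size` (say 100000) would be the natural choice; the small value here just exercises the loop.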
-
Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context:
Hi David,
I've tried using sep="\t" but it doesn't work, unfortunately.
Thanks for your help.
Thanks for all the suggestions. To the first person who replied: I can't
do anything with unix or perl. All I know is R.
@KEN:
I'm using Windows 7, 64 bit.
@Steve:
Here's the readLines output. As we can see, lines 1-3 are empty and line 5
is empty, and there are also empty elements after
Hi,
I am mediocre at R, with maybe 1000 hours of experience, but I received an 8GB
dataset and I don't know what to do with it. I have to do extensive analysis
on it for my Honours thesis.
I can't even import it. I've tried:
- Splitting it up using the free csv-splitter-1.1.zip that seems to be
Ray, your solution works and is indeed faster than mine!
It looks like it's still going to take a few days to do 400,000 rows, which
is unfortunate.
Steve, thanks for your help, I'll definitely self-teach plyr and data.table.
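Lists are the answer.
LIST <- list()
for (i in 1:ncol(results6))
{
  # one regression per column of results6
  LIST[[i]] <- lm(results6[,i] ~ data$observed)
}
You'll now have a 91-entry list of lm() fits. You can then do something like
this:
LIST2 <- list()
for (i in 1:length(LIST))
{
  # R-squared lives on the summary, not on the lm object itself
  LIST2[[i]] <- summary(LIST[[i]])$r.squared
}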
This should now be a list
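The same two steps can also be written without explicit loops. This is a minimal sketch on made-up data: `observed` and `results` are hypothetical stand-ins for data$observed and results6, with 3 columns instead of 91.

```r
set.seed(1)
observed <- rnorm(30)
# Hypothetical stand-in for results6: one response per column
results <- cbind(observed + rnorm(30), 2 * observed + rnorm(30), rnorm(30))

# One lm() fit per column, collected in a list
fits <- lapply(seq_len(ncol(results)), function(i) lm(results[, i] ~ observed))

# Pull R-squared out of each summary; vapply returns a plain numeric vector
r_squared <- vapply(fits, function(fit) summary(fit)$r.squared, numeric(1))
print(r_squared)
```

A numeric vector is usually easier to work with than a list here; `as.list(r_squared)` recovers the list form if that is what is needed.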
## I have 2 columns of data. The first column is unique event IDs that
## represent a phone call made to a customer.
### So, if you see 3 entries together in the first column as follows:
matrix(c("call1a","call1a","call1a"))
##then this means that this particular phone call (the first call that's
#The following works:
a <- array(rnorm(20), dim=c(10,2))
b <- array(rnorm(20), dim=c(10,2))
ab <- cbind(a, b)
ab <- array(ab, dim=c(10,2,2))
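The reason this works is that R stores arrays in column-major order, so the four columns end up as two 10x2 slices. A short sketch that checks this, and skips the cbind() step by concatenating directly (the names `a`, `b` and `ab` follow the snippet above):

```r
a <- array(rnorm(20), dim = c(10, 2))
b <- array(rnorm(20), dim = c(10, 2))

# c(a, b) lays out a's columns and then b's columns, matching the slice order
ab <- array(c(a, b), dim = c(10, 2, 2))

# Each slice along the third dimension is one of the original matrices
stopifnot(identical(ab[, , 1], a), identical(ab[, , 2], b))
```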
data <- matrix(rnorm(10))
data[c(1,4,6)] <- NA
print(data)
data <- matrix(data[!is.na(data)])
print(data)
I've done a lot of research on this very topic and found a few solutions. But
all the ways I've discovered involve loops.
Applying it to what you want, the best way I've found is to do (stolen from
an experienced R user, of course):
y <- array(rnorm(100), dim=c(10,10))
Michael, thank you for your post, I learned a lot.
Why is it that people prefer na.exclude to na.omit?
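One practical difference between the two, shown as a small sketch on made-up vectors (`x`, `y` and the fit names are illustrative only): both drop incomplete rows before fitting, but na.exclude pads residuals and fitted values back to the original length, which keeps them aligned with the input data.

```r
x <- c(1, 2, NA, 4, 5, 6)
y <- c(1.1, 1.9, 3.2, NA, 5.1, 5.8)

fit_omit    <- lm(y ~ x, na.action = na.omit)
fit_exclude <- lm(y ~ x, na.action = na.exclude)

# Same model, same coefficients: only rows 1, 2, 5 and 6 are used in both
print(coef(fit_omit))
print(coef(fit_exclude))

# na.omit returns residuals for the used rows only
print(length(residuals(fit_omit)))
# na.exclude returns one residual per original row, NA where a row was dropped
print(residuals(fit_exclude))
```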
Sorry if there's an easy answer to this problem, but here goes.
*INTRODUCTION CONTEXT*
I'm creating a function where the number of predictors in the lm(y~...)
call varies, i.e. depending on the function input I sometimes want
lm(y~x1+x2+x3) and sometimes lm(y~x1) only, et cetera.
I've completed this
Hi Bert,
Sorry for the sub-par post, again. I've revamped my OP; I hope it's
sufficient now.
Thank you.
Hi all,
I'm new to R.
Say I have a multi-layered list called newlist.
str(newlist)
List of 2
$ :List of 5
..$ : num [1:8088] NA 464 482 535 557 ...
..$ : num [1:8088, 1:2] NA 464 482 535 557 ...
..$ : num [1:8088, 1:3] NA 464 482 535 557 ...
..$ : num [1:8088, 1:4] NA 464
Josh, you've solved the problem, fantastic.
Thanks for as.formula() David, that will be of great use in my work.
Next time I'll provide better examples. Thanks for your help all.
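For anyone finding this thread later, here is a rough sketch of how as.formula() can handle a varying number of predictors. The helper `fit_with` and the data frame `d` are hypothetical, not the code posted in the thread.

```r
# Build "y ~ x1 + x2 + ..." from a character vector of column names
fit_with <- function(data, predictors, response = "y") {
  rhs <- paste(predictors, collapse = " + ")
  f <- as.formula(paste(response, "~", rhs))
  lm(f, data = data)
}

set.seed(7)
d <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = rnorm(40))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(40)

fit_full  <- fit_with(d, c("x1", "x2", "x3"))  # lm(y ~ x1 + x2 + x3)
fit_small <- fit_with(d, "x1")                 # lm(y ~ x1)
print(names(coef(fit_full)))
print(names(coef(fit_small)))
```

Base R's reformulate() builds the same kind of formula from a character vector and may read a little cleaner.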
Hi, R newb here. I've coded a function that takes N-dimensional array(s) [or
class=numeric if it's dim=1] of coefficients and tstats, where
dim(coef_matrix)=dim(tstat_matrix), and outputs a same-dimension
matrix of coefficients pasted to tstats in brackets pasted to significance
stars.
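To make the description concrete, a function of this kind might look roughly like the sketch below. This is not the function from the post: `format_coefs`, the two-decimal rounding and the star cutoffs (two-sided normal critical values at 10%, 5% and 1%) are all assumptions.

```r
format_coefs <- function(coefs, tstats, digits = 2) {
  stopifnot(identical(dim(coefs), dim(tstats)),
            length(coefs) == length(tstats))
  # Assumed cutoffs: |t| > 2.576 "***", > 1.960 "**", > 1.645 "*"
  stars <- ifelse(abs(tstats) > 2.576, "***",
           ifelse(abs(tstats) > 1.960, "**",
           ifelse(abs(tstats) > 1.645, "*", "")))
  out <- paste0(formatC(coefs, format = "f", digits = digits),
                " (", formatC(tstats, format = "f", digits = digits), ")",
                stars)
  # paste0() drops dimensions, so restore them (NULL for a plain vector)
  dim(out) <- dim(coefs)
  out
}

coef_matrix  <- matrix(c(0.5, -1.2, 0.03, 2.0), nrow = 2)
tstat_matrix <- matrix(c(3.1, -2.0, 0.4, 1.7), nrow = 2)
formatted <- format_coefs(coef_matrix, tstat_matrix)
print(formatted)
```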
Okay it's working perfectly now. I restarted R and it worked on my first 5
goes.
Can anybody shed light on how this kind of thing can happen?
Hi, I'm quite new to R (1 month full time use so far). I have to run loop
regressions VERY often in my work, so I would appreciate some new
methodology that I'm not considering.
#-
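One non-loop option worth knowing, when every regression shares the same predictors, is that lm() accepts a matrix on the left-hand side and fits one regression per column in a single call. A minimal sketch on made-up data (`x`, `Y` and `fit_all` are illustrative names):

```r
set.seed(42)
x <- rnorm(50)
# Two responses regressed on the same predictor
Y <- cbind(y1 = 1 + 2 * x + rnorm(50),
           y2 = -1 + 0.5 * x + rnorm(50))

# A matrix response gives a multi-response fit: one column of coefficients each
fit_all <- lm(Y ~ x)
print(coef(fit_all))

# summary() returns one ordinary lm summary per response
r_squared <- vapply(summary(fit_all), function(s) s$r.squared, numeric(1))
print(r_squared)
```

The coefficients match what separate lm() calls would give; this only helps when the right-hand side is identical across regressions.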
Thanks for the advice everyone. All very helpful.
@Bert
Added my information to signature, thanks.