Re: [R] fortune nomination WAS: Re: How long does it take to learn the R programming language?

2022-09-29 Thread Richard O'Keefe
"R longa, vita brevis."

On Thu, 29 Sept 2022 at 07:02, Berry, Charles 
wrote:

> Aha!
> CCB
>
> > On Sep 27, 2022, at 6:08 PM, Rolf Turner 
> wrote:
> >
> >
> > On Mon, 26 Sep 2022 11:14:57 +0800
> > Turritopsis Dohrnii Teo En Ming  wrote:
> >
> >> Subject: How long does it take to learn the R programming language?
> >>
> >> Good day from Singapore,
> >>
> >> How long does it take to learn the R programming language?
> >
> > How long is a piece of string? :-)
> >
> > cheers,
> >
> > Rolf Turner
> >
> > --
> > Honorary Research Fellow
> > Department of Statistics
> > University of Auckland
> > Phone: +64-9-373-7599 ext. 88276
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Reading very large text files into R

2022-09-29 Thread Richard O'Keefe
If I had this problem, in the old days I'd've whipped up
a tiny AWK script.  These days I might use xsv or qsv.
BUT
first I would want to know why these extra fields are
present and what they signify.  Are they good data that
happen not to be described in the documentation?  Do
they represent a defect in the generation process?  What
other discrepancies are there?  If the data *format*
cannot be fully trusted, what does that say about the
data *content*?  Do other data sets from the same source
have the same issue?  Is it possible to compare this
version of the data with an earlier version?
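
A first look at the discrepancy can also be had from inside R. The sketch below is an illustration rather than anything posted in the thread: it uses two lines copied from the sample Nick later sent to the list, and the helper name count_by() is made up here. It counts the fields on each line under two different separator assumptions, which may help decide whether the extra field is real or a side effect of how the lines are being split.

```r
# Two lines from the sample posted later in the thread (the second carries a "B").
sample_lines <- c(
  "1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,",
  "1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, , , B"
)

# Hypothetical helper: number of fields per line for a given separator.
count_by <- function(lines, sep) {
  con <- textConnection(lines)
  on.exit(close(con))
  count.fields(con, sep = sep, quote = "", comment.char = "")
}

n_comma <- count_by(sample_lines, sep = ",")  # comma-separated reading
n_white <- count_by(sample_lines, sep = "")   # whitespace-separated reading

# Tabulate the counts under each reading.
table(n_comma)
table(n_white)
```

On a real file, count.fields() accepts the file name directly, and which() on the result lists the line numbers whose count deviates. If the comma-based counts agree across lines while the whitespace-based counts do not, that would point at the separator setting rather than at the data.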

On Fri, 30 Sept 2022 at 02:54, Nick Wray  wrote:

> Hello   I may be offending the R purists with this question but it is
> linked to R, as will become clear.  I have very large data sets from the UK
> Met Office in notepad form.  Unfortunately,  I can’t read them directly
> into R because, for some reason, although most lines in the text doc
> consist of 15 elements, every so often there is a sixteenth one and R
> doesn’t like this and gives me an error message because it has assumed that
> every line has 15 elements and doesn’t like finding one with more.  I have
> tried playing around with the text document, inserting an extra element
> into the top line etc, but to no avail.
>
> Also unfortunately you need access permission from the Met Office to get
> the files in question so this link probably won’t work:
>
> https://catalogue.ceda.ac.uk/uuid/bbd6916225e7475514e17fdbf11141c1
>
> So what I have done is simply to copy and paste the text docs into excel
> csv and then read them in, which is time-consuming but works.  However the
> later datasets are over the excel limit of 1048576 lines.  I can paste in
> the first 1048576 lines but then trying to isolate the remainder of the
> text doc to paste it into a second csv doc is proving v difficult – the
> only way I have found is to scroll down by hand and that’s taking ages.  I
> cannot find another way of editing the notepad text doc to get rid of the
> part which I have already copied and pasted.
>
> Can anyone help with a)ideally being able to simply read the text tables
> into R  or b)suggest a way of editing out the bits of the text file I have
> already pasted in without laborious scrolling?
>
> Thanks Nick Wray
>



Re: [R] Deprecating download method='wininet' in R on Windows causes trouble with corporate proxy

2022-09-29 Thread Selke, Gisbert W.
On Fri, Sep 30, 2022 at 01:27 Henrik Bengtsson  
wrote

> Is R centrally installed?  
It usually (but not always...) is installed through a central install process, 
but installations will not be kept synchronized. (I.e., we end up having 
independent installations, many different versions, many different collections 
of packages (and versions thereof).)

> If so, environment variables 'HTTP_PROXY', 'HTTPS_PROXY', and 
> 'HTTPS_PROXY_USER' could 
> be set for all users by setting them in the R_HOME/etc/Renviron.site file.  
Yes, this can be delivered centrally. The problem arises if and when the 
variables for the proxy must be adapted (e.g., proxy URL changes). 
I am toying with the idea to source, from the Rprofile.site, a central 
configuration file (specified as a UNC path). Then, if the values of the 
environment variables need to be changed, this is done once for all, and 
roll-out to users will be immediate. 
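
A minimal sketch of that idea, under stated assumptions: the central file is written in Renviron syntax, the helper name load_central_proxy() is hypothetical, and a temporary file stands in for the UNC path so that the example can be run anywhere. Whether values loaded this way are honoured by every download path in a given R setup has not been checked here.

```r
# Hypothetical stand-in for a central file. In production this might be a UNC
# path such as "//fileserver/share/R/proxy.Renviron" (an assumed name).
central_file <- tempfile(fileext = ".Renviron")
writeLines(c(
  "HTTP_PROXY=http://proxy-host:3128/",
  "HTTPS_PROXY=https://proxy-host:3128/",
  "HTTPS_PROXY_USER=dummy"
), central_file)

# Hypothetical helper to call from Rprofile.site: load the central settings
# if the file is reachable, and carry on quietly if it is not.
load_central_proxy <- function(path) {
  if (!file.exists(path)) return(invisible(FALSE))
  invisible(readRenviron(path))
}

loaded <- load_central_proxy(central_file)
Sys.getenv(c("HTTP_PROXY", "HTTPS_PROXY", "HTTPS_PROXY_USER"))
```

Using readRenviron() rather than source() keeps the central file in the same KEY=value format as Renviron.site, so the same lines could be moved between the two without editing.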
[...]
> At least this avoids having to configure them in MS Windows settings, which is 
> tedious to document and explain.
I fully agree!

Thank you for your advice, Henrik!


Re: [R] Converting a Date variable from character to Date

2022-09-29 Thread avi.e.gross
I am not replying to the earlier request just to the part right below my
message.

A simple suggestion when sending people code is to add NOTHING except proper
comments.

Can we assume the extra asterisks are superfluous and not in your code?

I mean your column is named "Period" and not "*Period" and your meaningless
call to format(...) was not to *format(...)* ...

And I note in R upper and lower case are not interchangeable. CPI as a
column name does not match:

class(inflation.2$cpi)

I am not clear what the above is supposed to do. What do you want to set the
class of a column in a data.frame to? If I am guessing correctly, the normal
way people do a change from one TYPE to another looks more like:

after <- as.character(before)

In your case, your data is of type character and you want to make it a date
of one kind or another. If you do a little searching, you may find a bunch
of ways to convert properly formatted strings to dates or date/time types.
Your data is NOT a standard date format so none of the standard ones will
work.

I am guessing  "2022m1" may mean first month in 2022 and goes as high as
2022m12 before shifting to 2023. Good luck with that. It is far easier if
your data looked like "2022-01-01" or some such format that might be read
easily. You need to do one of many things I will not show here to break that
date into parts or have it parsed properly, as with a function like
strptime() or one from a package.

As a general comment, I hope your meaning of command line is within the R
interpreter rather than other meanings like for some shell utility.

And note that generally the R method of handling a data.frame using base R
or a package like dplyr requires most changes to be saved into the same or a
new variable. Your sample code makes no sense to me. 

So assuming at some point your code got the data you want into a data.frame
with a character column called inflation.1$Period, then base R would allow
you to call some function that does the conversion (which I am calling
doit() here) this way:

inflation.1$Period <- doit(inflation.1$Period)
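
For illustration only, here is one way the placeholder doit() could be filled in with base R. The body is an assumption made for this sketch, not code from the message: it treats "2022m1" as year 2022, month 1, and picks the first day of the month so that a complete date exists to parse.

```r
# Hypothetical body for the placeholder doit(): turn "2022m1" into a Date
# for the first day of that month.
doit <- function(x) {
  # "2022m1" -> "2022-1-01"; day 1 is an arbitrary but conventional choice
  as.Date(paste0(sub("m", "-", x, fixed = TRUE), "-01"), format = "%Y-%m-%d")
}

# A few rows taken from the posted data.
inflation.1 <- data.frame(Period = c("2022m1", "2022m2", "2022m9"),
                          CPI = c(4994, 5336, 14708))

# The result has to be assigned back, otherwise the column stays character.
inflation.1$Period <- doit(inflation.1$Period)
class(inflation.1$Period)
```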

Good Luck. You need to show a bit more knowledge of R before people can help
you with more advanced tasks.


-Original Message-
From: R-help  On Behalf Of Admire Tarisirayi
Chirume
Sent: Thursday, September 29, 2022 12:36 PM
To: Jeff Newmiller 
Cc: r-help mailing list 
Subject: [R] Converting a Date variable from character to Date

Kindly request assistance to *convert a Date variable from a character to be
recognized as a date*.
NB: kindly take note that the data is in a csv file called *inflation*. I
have included part of the file content herewith with the header for
assistance.


My data looks like this:
*Period CPI*
2022m1 4994
2022m2 5336
2022m3 5671
2022m4 6532
2022m5 7973
2022m6 10365
2022m7 12673
2022m8 14356
2022m9 14708

 I used the following command lines.


class(inflation.2$cpi)
inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
*format(as.Date(inflation.2$period), "%Y-%m")*

Having run the command lines above, the variable *period* in the attached
CSV file remains being read as a character variable. Kindly assist.

Thank you.


Alternative email: addtar...@icloud.com/tchir...@rbz.co.zw
Skype: admirechirume
Call: +263773369884
whatsapp: +818099861504



Re: [R] Deprecating download method='wininet' in R on Windows causes trouble with corporate proxy

2022-09-29 Thread Henrik Bengtsson
Is R centrally installed?  If so, environment variables 'HTTP_PROXY',
'HTTPS_PROXY', and 'HTTPS_PROXY_USER' could be set for all users by
setting them in the R_HOME/etc/Renviron.site file.  R_HOME is the
folder where R is installed.  You can find this file from within R by
calling:

> file.path(R.home("etc"), "Renviron.site")
[1] "C:/PROGRA~1/R/R-42~1.1/etc/Renviron.site"

If not centrally installed, I don't know anything better than users
setting them in their personal ~/.Renviron file;

> normalizePath("~/.Renviron")
[1] "C:\\Users\\alice\\Documents\\.Renviron"

For example,

> cat(file = "~/.Renviron", append = TRUE,
+     "HTTP_PROXY=http://proxy-host:3128/", "HTTPS_PROXY=https://proxy-host:3128/",
+     "HTTPS_PROXY_USER=dummy", sep = "\n")

At least this avoids having to configure them in MS Windows settings,
which is tedious to document and explain.

My $.02

Henrik

On Thu, Sep 29, 2022 at 3:48 PM Selke, Gisbert W.
 wrote:
>
> Method="wininet" is deprecated and scheduled to go away; the standard method 
> is now libcurl. This causes trouble for all R users in our shop, because we 
> are sitting behind a corporate proxy, which uses Kerberos authentication. 
> (We're all on Windows.)
>
> Using wininet, this used to work without problems and without additional 
> effort; it currently still does with explicit method="wininet" (which, by the 
> way, precludes use of the handy menu command "Update packages", which will 
> use the default method, i.e., libcurl as of now.)
>
> For the future, when wininet will be gone for good, the only option we have 
> is to resort to first setting environment variables HTTP_PROXY and 
> HTTPS_PROXY and then tricking the proxy out of using Kerberos, setting 
> HTTPS_PROXY_USER to a dummy string.
> This is certainly doable for R users with enough knowledge of the 
> technicalities of internet access, but our average R user will just be lost. 
> As has been pointed out elsewhere 
> (https://github.com/rstudio/rstudio/issues/10163#issuecomment-1154071514) , 
> this will create a lot of blood, sweat and tears (and swears), and it is a 
> moderate nightmare to maintain consistently and up-to-date for many users.
>
> My first question is: Since we are probably not the only institution in this 
> situation, has anyone come up with a robust and maintainable solution other 
> than our approach described above?
>
> Failing that: would it be possible at all to change the use that the R core 
> makes of libcurl in such a way that it would automagically Do The Right Thing 
> (tm)? In principle, this should be possible; after all, wininet did the 
> trick, and ordinary browsers can handle this situation. (Disclaimer: I know 
> nothing about the R internals so cannot say whether I am being overly naïve 
> here.)
>
> Any help appreciated.
>
> (I'm new to this list, so if this has been discussed here before, I apologize 
> and would be grateful for a pointer to do my reading.)
>



[R] Deprecating download method='wininet' in R on Windows causes trouble with corporate proxy

2022-09-29 Thread Selke, Gisbert W.
Method="wininet" is deprecated and scheduled to go away; the standard method is 
now libcurl. This causes trouble for all R users in our shop, because we are 
sitting behind a corporate proxy, which uses Kerberos authentication. (We're 
all on Windows.)

Using wininet, this used to work without problems and without additional 
effort; it currently still does with explicit method="wininet" (which, by the 
way, precludes use of the handy menu command "Update packages", which will use 
the default method, i.e., libcurl as of now.)

For the future, when wininet will be gone for good, the only option we have is 
to resort to first setting environment variables HTTP_PROXY and HTTPS_PROXY and 
then tricking the proxy out of using Kerberos, setting HTTPS_PROXY_USER to a 
dummy string.
This is certainly doable for R users with enough knowledge of the 
technicalities of internet access, but our average R user will just be lost. As 
has been pointed out elsewhere 
(https://github.com/rstudio/rstudio/issues/10163#issuecomment-1154071514) , 
this will create a lot of blood, sweat and tears (and swears), and it is a 
moderate nightmare to maintain consistently and up-to-date for many users.

My first question is: Since we are probably not the only institution in this 
situation, has anyone come up with a robust and maintainable solution other 
than our approach described above?

Failing that: would it be possible at all to change the use that the R core 
makes of libcurl in such a way that it would automagically Do The Right Thing 
(tm)? In principle, this should be possible; after all, wininet did the trick, 
and ordinary browsers can handle this situation. (Disclaimer: I know nothing 
about the R internals so cannot say whether I am being overly naïve here.)

Any help appreciated.

(I'm new to this list, so if this has been discussed here before, I apologize 
and would be grateful for a pointer to do my reading.)



[R] issue running svyglm after subsetting: NA/NaN/Inf in foreign function call (arg 1)

2022-09-29 Thread Felippe Marcondes
Hello,

I am attempting to run 1 svyglm model for each of the levels of a factor
variable.
When I use the subset function in the survey design object, I get
the following error:

Error in qr.default(weights(design, "analysis"), tol = 1e-05) :
  NA/NaN/Inf in foreign function call (arg 1)

I am using the api data for a minimal reproducible example.

# loading package and data
library(survey)
data(api)

# creating the svyrep design object
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

rclus1<-as.svrepdesign(dclus1)


# attempting to run svyglm model with subsetted design object:
t <- svyglm(awards ~ comp.imp + api99 + api00 + cname + cnum + meals + ell,
design = subset(rclus1, as.factor(stype=="E")), family = quasibinomial)

I get the following error:
Error in qr.default(weights(design, "analysis"), tol = 1e-05) :
  NA/NaN/Inf in foreign function call (arg 1)

How do I properly subset the design object by each level of the stype
variable for the svyglm model to run?

Thanks,

Felippe
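
One thing that may be worth checking is the as.factor() wrapped around the subsetting condition. The sketch below uses only base R and made-up values and does not run svyglm. It shows that a factor built from a comparison no longer behaves like a logical vector once it is combined with a logical operator. That subset() on a design object combines the condition in this way internally is an assumption about the survey package, not something confirmed in this message.

```r
# Made-up school types standing in for the stype column.
stype <- factor(c("E", "H", "M", "E"))

# What the posted call passes to subset(): a factor with levels FALSE/TRUE.
cond_factor <- as.factor(stype == "E")

# Combining a factor with `&` is not meaningful in R: it warns and yields NA.
combined <- suppressWarnings(cond_factor & TRUE)
combined

# A plain logical condition keeps its TRUE/FALSE values.
cond_logical <- stype == "E"
kept <- cond_logical & !is.na(cond_logical)
kept
```

If that is what happens here, passing the bare comparison, as in subset(rclus1, stype == "E"), may be enough to avoid the NA weights; this has not been run against the api data.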



Re: [R] Converting a Date variable from character to Date

2022-09-29 Thread Rui Barradas

Hello,

You have to paste a day before coercing to class "Date". A usual choice 
for this is day 1.



inflation.2 <- 'Period CPI
2022m1 4994
2022m2 5336
2022m3 5671
2022m4 6532
2022m5 7973
2022m6 10365
2022m7 12673
2022m8 14356
2022m9 14708
'
inflation.2 <- read.table(textConnection(inflation.2), header = TRUE)

inflation.2$Period2 <- as.Date(paste(inflation.2$Period, 1), "%Ym%m %d")
inflation.2
#>   Period   CPI    Period2
#> 1 2022m1  4994 2022-01-01
#> 2 2022m2  5336 2022-02-01
#> 3 2022m3  5671 2022-03-01
#> 4 2022m4  6532 2022-04-01
#> 5 2022m5  7973 2022-05-01
#> 6 2022m6 10365 2022-06-01
#> 7 2022m7 12673 2022-07-01
#> 8 2022m8 14356 2022-08-01
#> 9 2022m9 14708 2022-09-01

format(inflation.2$Period2, "%Y-%m")
#> [1] "2022-01" "2022-02" "2022-03" "2022-04" "2022-05" "2022-06" "2022-07"
#> [8] "2022-08" "2022-09"

zoo::as.yearmon(inflation.2$Period2)
#> [1] "Jan 2022" "Feb 2022" "Mar 2022" "Apr 2022" "May 2022" "Jun 2022" "Jul 2022"
#> [8] "Aug 2022" "Sep 2022"


Hope this helps,

Rui Barradas

At 17:35 on 29/09/2022, Admire Tarisirayi Chirume wrote:

Kindly request assistance to *convert a Date variable from a character to
be recognized as a date*.
NB: kindly take note that the data is in a csv file called *inflation*. I
have included part of the file content herewith with the header for
assistance.


My data looks like this:
*Period CPI*
2022m1 4994
2022m2 5336
2022m3 5671
2022m4 6532
2022m5 7973
2022m6 10365
2022m7 12673
2022m8 14356
2022m9 14708

  I used the following command lines.


class(inflation.2$cpi)
inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
*format(as.Date(inflation.2$period), "%Y-%m")*

Having run the command lines above, the variable *period* in the attached
CSV file remains being read as a character variable. Kindly assist.

Thank you.


Alternative email: addtar...@icloud.com/tchir...@rbz.co.zw
Skype: admirechirume
Call: +263773369884
whatsapp: +818099861504


On Thu, Sep 29, 2022 at 6:10 PM Jeff Newmiller 
wrote:


Your attachment was stripped by the mailing list. The criteria for allowed
attachments are a bit tricky to translate into actions to apply to your
email software, so usually including part of your file in the body of the
email is the most successful approach for communicating your problem. Be
sure to use a text editor or the

   readLines("filename.csv") |> head() |> dput()

functions in R to extract lines of your file for inclusion in the email.

On September 29, 2022 8:52:30 AM PDT, Admire Tarisirayi Chirume <
atchir...@gmail.com> wrote:

I kindly request for assistance to convert a Date variable from a character
to be recognised as a date. I used the following command lines.

inflation<-read.csv("Inflation_forecasts_1.csv")
attach(inflation)
inflation[,1:2 ] #subsetting the dataframe
#Renaming variables
inflation<- rename(inflation.df,
   cpi = CPI,
   year=period)

#subsetting data April 2020 to current
inflation.2<-data.frame(inflation[-c(1:135),])
class(inflation.2$cpi)
inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
* format(as.Date(inflation.2$period), "%Y-%m")*

Having run the command lines above, the variable period in the attached csv
file remains being read as a character variable. Kindly assist.

Thank you.


--
Sent from my phone. Please excuse my brevity.







Re: [R] Reading very large text files into R

2022-09-29 Thread Dr Eberhard W Lisse
To me this file looks like a CSV with 15 fields (on each line) not 16,
the last field being empty with the exception of the one which has the
'B'.  The 14th is always empty.

I also note that it does not seem to have a new line at the end.


I can strongly recommend QSV to manipulate CSV files and CSVIEW to look
at them

After renaming the file for convenience you can do something like

qsv input --trim-fields --trim-headers sample.csv \
| qsv select -n "1,2,6,7,8,9,10" \
| qsv rename "date,c2,type,c4,c5,c6,c7" \
| csview -i5 -np0

and get something like

┌──┬────────────────┬──────┬───────┬────┬────┬──┬──┐
│# │      date      │  c2  │ type  │ c4 │ c5 │c6│c7│
├──┼────────────────┼──────┼───────┼────┼────┼──┼──┤
│1 │1980-01-01 10:00│226918│WAHRAIN│5124│1001│0 │  │
│2 │1980-01-01 10:00│228562│WAHRAIN│491 │1001│0 │  │
│3 │1980-01-01 10:00│231581│WAHRAIN│5213│1001│0 │  │
│4 │1980-01-01 10:00│232671│WAHRAIN│487 │1001│0 │  │
│5 │1980-01-01 10:00│232913│WAHRAIN│5243│1001│0 │  │
│6 │1980-01-01 10:00│234362│WAHRAIN│5265│1001│0 │  │
│7 │1980-01-01 10:00│234682│WAHRAIN│5271│1001│0 │  │
│8 │1980-01-01 10:00│235389│WAHRAIN│5279│1001│0 │  │
│9 │1980-01-01 10:00│236466│WAHRAIN│497 │1001│0 │  │
│10│1980-01-01 10:00│243350│SREW   │484 │1001│0 │  │
│11│1980-01-01 10:00│243350│WAHRAIN│484 │1001│0 │0 │
└──┴────────────────┴──────┴───────┴────┴────┴──┴──┘

As the files do not have headers, you could, if you have multiple files,
even do something like

qsv cat rows s*.csv \
| qsv input --trim-fields --trim-headers \
| qsv select -n "1,2,6,7,8,9,10" \
| qsv rename "date,c2,type,c4,c5,c6,c7" \
| qsv dedup 2>/dev/null -o readmeintoR.csv


If it was REALLY a file with different numbers of fields you can use
CSVQ and do something like

cat s*csv \
| csvq --format CSV --no-header --allow-uneven-fields \
"SELECT c1 as date, c2, c6 as type, c7 as c4,
  c8 as c5, c9 as c6, c10 as c7
FROM stdin" \
| qsv input --trim-fields --trim-headers \
| qsv dedup 2>/dev/null -o readmeintoR.csv

And, finally, depending on how long the reading of the CSV takes, I
would save it into a RDS, loading of which is very fast.
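
On the R side, that last step could look roughly like this. It is a sketch with a tiny stand-in data frame (values taken from the table above) and a temporary file name, both chosen here for illustration; with the real data the data frame would come from reading readmeintoR.csv.

```r
# Small stand-in for the cleaned data; column names follow the qsv rename above.
rain <- data.frame(
  date = as.POSIXct("1980-01-01 10:00", tz = "UTC"),
  c2   = c(226918L, 228562L),
  type = "WAHRAIN"
)

rds_file <- tempfile(fileext = ".rds")  # hypothetical location
saveRDS(rain, rds_file)      # write once, after the slow CSV import
rain2 <- readRDS(rds_file)   # later sessions read the binary file directly

# Check that the round trip preserved the data.
all.equal(rain, rain2)
```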


greetings, el

On 2022-09-29 17:26 , Nick Wray wrote:
> Hi Bert   
> 
> Right Thing is, I didn't know that there even was an instruction like
> read.csv(text = "...  your text...  ") so at any rate I can paste the
> original text files in by hand if there's no shorter cut
> Thanks v much Nick
[...]



Re: [R] Converting a Date variable from character to Date

2022-09-29 Thread Admire Tarisirayi Chirume
Thank you for the code. It helped.
I greatly appreciate.

Alternative email: addtar...@icloud.com/tchir...@rbz.co.zw
Skype: admirechirume
Call: +263773369884
whatsapp: +818099861504


On Thu, Sep 29, 2022 at 7:20 PM jim holtman  wrote:

> Try this by adding a "day" to the date field
>
> library(tidyverse)
> library(lubridate)
> input <- "*Period CPI*
> 2022m1 4994
> 2022m2 5336
> 2022m3 5671
> 2022m4 6532
> 2022m5 7973
> 2022m6 10365
> 2022m7 12673
> 2022m8 14356
> 2022m9 14708"
>
> m_data <- read.delim(text = input, sep = "")
>
> # convert the date by adding a "day" before the conversion
>
> m_data$date <- ymd(paste0(m_data$X.Period, '-1'))
> m_data
>
> ##   X.Period  CPI.       date
> ## 1   2022m1  4994 2022-01-01
> ## 2   2022m2  5336 2022-02-01
> ## 3   2022m3  5671 2022-03-01
> ## 4   2022m4  6532 2022-04-01
> ## 5   2022m5  7973 2022-05-01
> ## 6   2022m6 10365 2022-06-01
> ## 7   2022m7 12673 2022-07-01
> ## 8   2022m8 14356 2022-08-01
> ## 9   2022m9 14708 2022-09-01
>
>
>
> Thanks
>
> Jim Holtman
> *Data Munger Guru*
>
>
> *What is the problem that you are trying to solve? Tell me what you want to
> do, not how you want to do it.*
>
>
> On Thu, Sep 29, 2022 at 9:36 AM Admire Tarisirayi Chirume <
> atchir...@gmail.com> wrote:
>
>> Kindly request assistance to *convert a Date variable from a character to
>> be recognized as a date*.
>> NB: kindly take note that the data is in a csv file called *inflation*. I
>> have included part of the file content herewith with the header for
>> assistance.
>>
>>
>> My data looks like this:
>> *Period CPI*
>> 2022m1 4994
>> 2022m2 5336
>> 2022m3 5671
>> 2022m4 6532
>> 2022m5 7973
>> 2022m6 10365
>> 2022m7 12673
>> 2022m8 14356
>> 2022m9 14708
>>
>>  I used the following command lines.
>>
>>
>> class(inflation.2$cpi)
>> inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
>> *format(as.Date(inflation.2$period), "%Y-%m")*
>>
>> Having run the command lines above, the variable *period* in the attached
>> CSV file remains being read as a character variable. Kindly assist.
>>
>> Thank you.
>>
>>
>> Alternative email: addtar...@icloud.com/tchir...@rbz.co.zw
>> Skype: admirechirume
>> Call: +263773369884
>> whatsapp: +818099861504
>>
>>
>> On Thu, Sep 29, 2022 at 6:10 PM Jeff Newmiller 
>> wrote:
>>
>> > Your attachment was stripped by the mailing list. The criteria for allowed
>> > attachments are a bit tricky to translate into actions to apply to your
>> > email software, so usually including part of your file in the body of the
>> > email is the most successful approach for communicating your problem. Be
>> > sure to use a text editor or the
>> >
>> >   readLines("filename.csv") |> head() |> dput()
>> >
>> > functions in R to extract lines of your file for inclusion in the email.
>> >
>> > On September 29, 2022 8:52:30 AM PDT, Admire Tarisirayi Chirume <
>> > atchir...@gmail.com> wrote:
>> > >I kindly request for assistance to convert a Date variable from a character
>> > >to be recognised as a date. I used the following command lines.
>> > >
>> > >inflation<-read.csv("Inflation_forecasts_1.csv")
>> > >attach(inflation)
>> > >inflation[,1:2 ] #subsetting the dataframe
>> > >#Renaming variables
>> > >inflation<- rename(inflation.df,
>> > >   cpi = CPI,
>> > >   year=period)
>> > >
>> > >#subsetting data April 2020 to current
>> > >inflation.2<-data.frame(inflation[-c(1:135),])
>> > >class(inflation.2$cpi)
>> > >inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
>> > >* format(as.Date(inflation.2$period), "%Y-%m")*
>> > >
>> > >Having run the command lines above, the variable period in the attached csv
>> > >file remains being read as a character variable. Kindly assist.
>> > >
>> > >Thank you.
>> >
>> > --
>> > Sent from my phone. Please excuse my brevity.
>> >
>>
>>
>


Re: [R] [External] Fwd: Reading very large text files into R

2022-09-29 Thread Richard M. Heiberger
I think you need the
  fill=TRUE
argument. See 
?read.table
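
A small sketch of what that can look like, using made-up lines rather than the Met Office data. The names txt, n_col and dat are chosen here for illustration. Since read.table() infers the number of columns from the first few lines only, the sketch counts the fields first and passes explicit column names, so that a long line further down the file is not cut short.

```r
# Made-up lines with an uneven number of fields (not Met Office data).
txt <- c("a, 1, x",
         "b, 2, y, EXTRA",
         "c, 3, z")

# Find the widest line first, so read.table() is told how many columns to
# expect instead of inferring it from the first few lines.
con <- textConnection(txt)
n_col <- max(count.fields(con, sep = ","))
close(con)

# fill = TRUE pads the shorter rows instead of stopping with an error.
dat <- read.table(text = txt, sep = ",", header = FALSE, fill = TRUE,
                  strip.white = TRUE,
                  col.names = paste0("V", seq_len(n_col)))
dim(dat)
```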

> On Sep 29, 2022, at 11:14, Enrico Schumann  wrote:
> 
> On Thu, 29 Sep 2022, Nick Wray writes:
> 
>> -- Forwarded message -
>> From: Nick Wray 
>> Date: Thu, 29 Sept 2022 at 15:32
>> Subject: Re: [R] Reading very large text files into R
>> To: Ben Tupper 
>> 
>> 
>> Hi Ben
>> Beneath is an example of the text (also in an attachment) and it's the "B",
>> of which there are quite a few scattered throughout the text doc which
>> causes the reading in error message (btw I don't need the "RAIN" column or
>> the 1's after it or the last four elements). I have also attached the
>> snippet as text file
>> 
>> 1980-01-01 10:00, 225620, RAIN, 1, 1, WAHRAIN, 5091, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 226918, RAIN, 1, 1, WAHRAIN, 5124, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 228562, RAIN, 1, 1, WAHRAIN, 491, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 231581, RAIN, 1, 1, WAHRAIN, 5213, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 232671, RAIN, 1, 1, WAHRAIN, 487, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, , , B
>> 1980-01-01 10:00, 234682, RAIN, 1, 1, WAHRAIN, 5271, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 235389, RAIN, 1, 1, WAHRAIN, 5279, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 236466, RAIN, 1, 1, WAHRAIN, 497, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 243350, RAIN, 1, 1, SREW, 484, 1001, 0, , 9, 0, , ,
>> 1980-01-01 10:00, 243350, RAIN, 1, 1, WAHRAIN, 484, 1001, 0, 0, 9, 9, , ,
>> 
>> Thanks Nick
>> 
>> On Thu, 29 Sept 2022 at 15:12, Ben Tupper  wrote:
>> 
>>> Hi Nick,
>>> 
>>> It's hard to know without seeing at least a snippet of the data.
>>> Could you do the following and paste the result into a plain text
>>> email? If you don't set your email client to plain text (from rich
>>> text or html) then we are apt to see a jumble of output on our email
>>> clients.
>>> 
>>> 
>>> ## start
>>> x <- readLines(filename, n = 20)
>>> cat(x, sep = "\n")
>>> ## end
>>> 
>>> Cheers,
>>> Ben
>>> 
>>> 
>>> On Thu, Sep 29, 2022 at 9:54 AM Nick Wray  wrote:
 
>>>> Hello I may be offending the R purists with this question but it is
>>>> linked to R, as will become clear. I have very large data sets from the UK
>>>> Met Office in notepad form. Unfortunately, I can’t read them directly
>>>> into R because, for some reason, although most lines in the text doc
>>>> consist of 15 elements, every so often there is a sixteenth one and R
>>>> doesn’t like this and gives me an error message because it has assumed that
>>>> every line has 15 elements and doesn’t like finding one with more. I have
>>>> tried playing around with the text document, inserting an extra element
>>>> into the top line etc, but to no avail.
>>>>
>>>> Also unfortunately you need access permission from the Met Office to get
>>>> the files in question so this link probably won’t work:
>>>>
>>>> https://catalogue.ceda.ac.uk/uuid/bbd6916225e7475514e17fdbf11141c1
>>>>
>>>> So what I have done is simply to copy and paste the text docs into excel
>>>> csv and then read them in, which is time-consuming but works. However the
>>>> later datasets are over the excel limit of 1048576 lines. I can paste in
>>>> the first 1048576 lines but then trying to isolate the remainder of the
>>>> text doc to paste it into a second csv doc is proving v difficult – the
>>>> only way I have found is to scroll down by hand and that’s taking ages. I
>>>> cannot find another way of editing the notepad text doc to get rid of the
>>>> part which I have already copied and pasted.
>>>>
>>>> Can anyone help with a)ideally being able to simply read the text tables
>>>> into R or b)suggest a way of editing out the bits of the text file I have
>>>> already pasted in without laborious scrolling?
>>>>
>>>> Thanks Nick Wray
 
> 
> [...]
> 
>>> 
>>> --
>>> Ben Tupper (he/him)
>>> Bigelow Laboratory for Ocean Science
>>> East Boothbay, Maine
>>> http://www.bigelow.org/
>>> 

Re: [R] Converting a Date variable from character to Date

2022-09-29 Thread jim holtman
Try this by adding a "day" to the date field

library(tidyverse)
library(lubridate)
input <- "*Period CPI*
2022m1 4994
2022m2 5336
2022m3 5671
2022m4 6532
2022m5 7973
2022m6 10365
2022m7 12673
2022m8 14356
2022m9 14708"

m_data <- read.delim(text = input, sep = "")

# convert the date by adding a "day" before the conversion

m_data$date <- ymd(paste0(m_data$X.Period, '-1'))
m_data

##   X.Period  CPI.   date
## 1   2022m1  4994 2022-01-01
## 2   2022m2  5336 2022-02-01
## 3   2022m3  5671 2022-03-01
## 4   2022m4  6532 2022-04-01
## 5   2022m5  7973 2022-05-01
## 6   2022m6 10365 2022-06-01
## 7   2022m7 12673 2022-07-01
## 8   2022m8 14356 2022-08-01
## 9   2022m9 14708 2022-09-01
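
For comparison, here is a base R sketch that avoids the extra packages. The
original attempt called format(as.Date(...)) without assigning the result, and
format() returns character in any case, so the column could not change class
(as.Date() would also likely fail on strings such as "2022m1" without an
explicit format). The small data frame below is only a stand-in built from
values in the post, and the new column name `date` is an assumption.

```r
# Stand-in for the poster's data frame (three rows copied from the post).
inflation.2 <- data.frame(period = c("2022m1", "2022m2", "2022m9"),
                          cpi = c("4994", "5336", "14708"))

# Convert CPI from character to numeric.
inflation.2$cpi <- as.numeric(inflation.2$cpi)

# Append a day, then tell as.Date() that a literal "m" sits between the
# year and the month. The result must be assigned back to the data frame.
inflation.2$date <- as.Date(paste0(inflation.2$period, "-1"),
                            format = "%Ym%m-%d")

class(inflation.2$date)
```

If only a year-month label is wanted for printing, format(inflation.2$date,
"%Y-%m") can be applied afterwards, but that result is character again and
should be kept in a separate column.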



Thanks

Jim Holtman
*Data Munger Guru*


*What is the problem that you are trying to solve? Tell me what you want to
do, not how you want to do it.*


On Thu, Sep 29, 2022 at 9:36 AM Admire Tarisirayi Chirume <
atchir...@gmail.com> wrote:

> Kindly request assistance to *convert a Date variable from a character to
> be recognized as a date*.
> NB: kindly take note that the data is in a csv file called *inflation*. I
> have included part of the file content herewith with the header for
> assistance.
>
>
> My data looks like this:
> *Period CPI*
> 2022m1 4994
> 2022m2 5336
> 2022m3 5671
> 2022m4 6532
> 2022m5 7973
> 2022m6 10365
> 2022m7 12673
> 2022m8 14356
> 2022m9 14708
>
>  I used the following command lines.
>
>
> class(inflation.2$cpi)
> inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
> *format(as.Date(inflation.2$period), "%Y-%m")*
>
> Having run the command lines above, the variable *period* in the attached
> CSV file remains being read as a character variable. Kindly assist.
>
> Thank you.
>
>
> Alternative email: addtar...@icloud.com/tchir...@rbz.co.zw
> Skype: admirechirume
> Call: +263773369884
> whatsapp: +818099861504
>
>
> On Thu, Sep 29, 2022 at 6:10 PM Jeff Newmiller 
> wrote:
>
> > Your attachment was stripped by the mailing list. The criteria for
> allowed
> > attachments are a bit tricky to translate into actions to apply to your
> > email software, so usually including part of your file in the body of the
> > email is the most successful approach for communicating your problem. Be
> > sure to use a text editor or the
> >
> >   readLines("filename.csv") |> head() |> dput()
> >
> > functions in R to extract lines of your file for inclusion in the email.
> >
> > On September 29, 2022 8:52:30 AM PDT, Admire Tarisirayi Chirume <
> > atchir...@gmail.com> wrote:
> > >I kindly request for assistance to convert a Date variable from a
> > character
> > >to be recognised as a date. I used the following command lines.
> > >
> > >inflation<-read.csv("Inflation_forecasts_1.csv")
> > >attach(inflation)
> > >inflation[,1:2 ] #subsetting the dataframe
> > >#Renaming variables
> > >inflation<- rename(inflation.df,
> > >   cpi = CPI,
> > >   year=period)
> > >
> > >#subsetting data April 2020 to current
> > >inflation.2<-data.frame(inflation[-c(1:135),])
> > >class(inflation.2$cpi)
> > >inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
> > >* format(as.Date(inflation.2$period), "%Y-%m")*
> > >
> > >Having ran the command lines above, the variable period in the attached
> > csv
> > >file remains being read as a character variable. Kindly assist.
> > >
> > >Thank you.
> > >__
> > >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > >PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > >and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Sent from my phone. Please excuse my brevity.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] new version of package declared

2022-09-29 Thread Adrian Dușa
Dear R-list,

It gives me great pleasure to announce the release of version 0.18 of
package declared, that makes a difference between empty missing values (the
current NAs in R) and declared missing values (NAs with a reason).

Besides an automatic detection of such values by most base R operations
(including ==, != etc.), package declared also offers weighted versions of
most common summaries (ex. mean, median, sd, quantile) that also apply on
the declared missing values, something not covered by other similar
functions.

Version 0.18 is fully documented using Roxygen2 and extensively (even
obsessively) tested to achieve no less than 100% code coverage, using more
than 750 tests including the "golden" ones capturing output using
snapshots. It also features three Vignettes showcasing the package's
features.

Many thanks to Daniel Antal for the enthusiastic support and contribution
to the initial Roxygen2 and pkgdown documentation, bringing this package
closer to the rOpenSci standards. Most functions are now generic, allowing
extensions to any type of object, for both creation and coercion to class
"declared".

Platform specific binaries are soon to be built on CRAN, but they are also
available and can be installed from R-universe package page:
https://dusadrian.r-universe.dev/ui#package:declared

As always, bug reporting and feature proposals are more than welcome.
Adrian

[[alternative HTML version deleted]]

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about Line Ending Choice

2022-09-29 Thread Stephen H. Dawson, DSL via R-help

Awesome idea, Jorgen. Thanks for the input.

As expected, it was smart to ask about this matter before I undertook my 
build effort.



Kindest Regards,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com


On 9/28/22 12:06, Jorgen Harmse via R-help wrote:

eol seems to be the parameter to use, but the answers so far appear to assume 
that the file is created on a Mac. For example, I think that "\r\n" on Windows 
would produce CR CR LF. I don't have both systems handy (so I can't test), but 
I think you should use raw to specify the bytes you want.

# I think the following are independent of the OS on which you are writing the 
# file.
CR <- rawToChar(as.raw(13))
LF <- rawToChar(as.raw(10))
# This fragment assumes it sits inside a function with a 'target' argument.
if (missing(target)) {
   # Hope that it matches the machine on which you are writing the file.
   eol <- "\n"
} else if (target == "Windows") {
   # eol must be one string; c(CR, LF) would be a vector of length 2.
   eol <- paste0(CR, LF)
} else if (target %in% c("Unix", "Mac")) {
   eol <- LF
# } else if … {
} else {
   stop("Unexpected target.")
}

write.table(eol=eol, …)
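
One way to check the bytes that actually reach the disk is sketched below. It
assumes that a connection opened in binary mode ("wb") writes the eol string
untranslated on every platform, which is the usual way to avoid newline
translation on Windows. The data frame, the temporary file, and the variable
names are made up for illustration.

```r
# Made-up two-row data frame to export.
df <- data.frame(id = 1:2, name = c("a", "b"))

path <- tempfile(fileext = ".csv")

# Open the file in binary mode so the eol string is written as given.
con <- file(path, open = "wb")
write.table(df, file = con, sep = ",", eol = "\r\n", row.names = FALSE)
close(con)

# Read the file back as raw bytes and count CR (13) and LF (10).
bytes <- readBin(path, what = "raw", n = file.size(path))
n_cr <- sum(bytes == as.raw(13))
n_lf <- sum(bytes == as.raw(10))
c(CR = n_cr, LF = n_lf)
```

Equal CR and LF counts, one pair per written line (header included), suggest
that the file has plain CRLF endings rather than CR CR LF.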

Regards,
Jorgen Harmse.



Message: 7
Date: Tue, 27 Sep 2022 11:35:54 -0400
From: "Stephen H. Dawson, DSL" 
To: Bert Gunter 
Cc: r-help 
Subject: Re: [R] Question about Line Ending Choice
Message-ID: <04e458aa-e5f5-c932-da3c-1aa35db7d...@shdawson.com>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Hi Bert,


Thanks for the reply.

I did see the parameter, but was not sure if this is the correct
parameter to reference. I also see it in write.csv.

I take it you are saying the eol parameter is the best practice for
exporting from R using these functions. Am I correct or is there another
option other than write.csv and write.table I should be considering?


Thanks,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com


On 9/27/22 11:29, Bert Gunter wrote:

Did you not see the "eol" parameter in write.table ?

Bert

On Tue, Sep 27, 2022 at 8:23 AM Stephen H. Dawson, DSL via R-help
 wrote:

 Hi All,


 I am writing with a question about choosing the line ending aspect
 of a
 file, please.

 I use write.csv and write.table to export work to CSV files and TXT
 files. I am planning now on how to share my work with the Windows
 crowd
 beyond only sharing with the Linux crowd. I use my text editor to
 flip
 the line ending option from Linux to Windows after exporting. This is
 inefficient for me to accomplish if I ramp up production as I expect
 will occur.

 Staying with the character encoding of UTF-8 seems fine for now from
 what I understand I need to deliver to my customers.

 What seems more efficient to me is to learn how to use R to define
 the
 line ending aspect of the exported file. I have not found if this
 is an
 option within R.

 QUESTION
 Is it possible within R to define the line ending aspect of file
 output?


 Kindest Regards,
 --
 *Stephen Dawson, DSL*
 /Executive Strategy Consultant/
 Business & Technology
 +1 (865) 804-3454
 http://www.shdawson.com

 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 
 and provide commented, minimal, self-contained, reproducible code.




**

[[alternative HTML version deleted]]


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about Line Ending Choice

2022-09-29 Thread Stephen H. Dawson, DSL via R-help

Hi Enrico,


You bring me the missing piece of my understanding to my conceptual 
planning and cost counting to avoid delivery costs being greater than 
acceptable.


Much appreciated.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com


On 9/29/22 05:24, Enrico Schumann wrote:

On Tue, 27 Sep 2022, Stephen H. Dawson, DSL via R-help writes:


Hi All,


I am writing with a question about choosing the line
ending aspect of a file, please.

I use write.csv and write.table to export work to CSV
files and TXT files. I am planning now on how to share
my work with the Windows crowd beyond only sharing with
the Linux crowd. I use my text editor to flip the line
ending option from Linux to Windows after
exporting. This is inefficient for me to accomplish if
I ramp up production as I expect will occur.

Staying with the character encoding of UTF-8 seems fine
for now from what I understand I need to deliver to my
customers.

What seems more efficient to me is to learn how to use
R to define the line ending aspect of the exported
file. I have not found if this is an option within R.

QUESTION
Is it possible within R to define the line ending aspect of file output?


Kindest Regards,

Just a remark: there is a "standard" for CSV,
https://datatracker.ietf.org/doc/html/rfc4180.
It always requires CRLF as the line ending.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How long does it take to learn the R programming language?

2022-09-29 Thread avi.e.gross
Has anyone noticed something a tad unusual?

 

Someone shows up and seemingly politely asks a totally open-ended question and 
supplies NO DETAILS about their personal status and experience that would be 
needed to tell him how long it might take him to learn enough R for whatever 
purposes.

 

Lots of people jump in and discuss it, and I choose this time to sit and wait 
and not point out the endless considerations others have nicely contributed.

 

What is missing is not a single polite reply from the original person 
acknowledging these efforts on his behalf, let alone ANSWERING some of the 
questions like whether he already has some experience with programming or what 
he wants to use R for.

 

As such, I am suspicious and won’t get involved with this and suggest others 
reconsider the need to keep discussing the topic unless it is for their own 
interest.

 

I have seen this many times now on multiple such boards. Either some people do 
not understand what is expected, or someone is trolling and just looking to get 
a reaction.

 

I prefer to deal with more focused questions if someone is asking for help such 
as what package might help them do a somewhat specific task or why they are 
getting an error message. A general question like whether R or Python or 
something else is better for a particular task might also be reasonable. But 
how long it takes to learn ANYTHING seems to be a very subjective question, let 
alone something as multi-faceted as a programming language that can be used for 
so many different things.

 

Just my two cents.

 

I will say it did not take me long to learn a decent amount of R and yet I keep 
learning and am very far from knowing a fraction of all there is to know and 
especially not things I have had no reason to know yet.

 

From: jim holtman  
Sent: Thursday, September 29, 2022 12:28 PM
To: Ebert,Timothy Aaron 
Cc: Avi Gross ; John Kane ; R. 
Mailing List 
Subject: Re: [R] How long does it take to learn the R programming language?

 

Still at it after 38 years.  First came across S at Bell Labs in 1984.

 

Thanks


Jim Holtman
Data Munger Guru
 
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

 

 

On Thu, Sep 29, 2022 at 7:09 AM Ebert,Timothy Aaron <teb...@ufl.edu> wrote:

Learning R takes an hour. Find an hourglass, flip it over. Meanwhile we will 
start increasing the size of the upper chamber and adding more sand. 

Mastery of R is an asymptotic function of time. 

While such answers might indicate trying for mastery is futile, you can learn 
enough R to be very useful long before "mastery."

Tim
-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Avi Gross
Sent: Wednesday, September 28, 2022 5:51 PM
To: John Kane <jrkrid...@gmail.com>
Cc: R. Mailing List <r-help@r-project.org>
Subject: Re: [R] How long does it take to learn the R programming language?

[External Email]

So is the proper R answer simply Inf?

On Wed, Sep 28, 2022, 5:39 PM John Kane <jrkrid...@gmail.com> wrote:

> + 1
>
> On Wed, 28 Sept 2022 at 17:36, Jim Lemon wrote:
>
> > Given some of the questions that are posted to this list, I am not 
> > sure that there is an upper bound to the estimate.
> >
> > Jim
> >
> >
>
>
> --
> John Kane
> Kingston ON Canada
>
> [[alternative HTML version deleted]]
>

[R] Converting a Date variable from character to Date

2022-09-29 Thread Admire Tarisirayi Chirume
Kindly request assistance to *convert a Date variable from a character to
be recognized as a date*.
NB: kindly take note that the data is in a csv file called *inflation*. I
have included part of the file content herewith with the header for
assistance.


My data looks like this:
*Period CPI*
2022m1 4994
2022m2 5336
2022m3 5671
2022m4 6532
2022m5 7973
2022m6 10365
2022m7 12673
2022m8 14356
2022m9 14708

 I used the following command lines.


class(inflation.2$cpi)
inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
*format(as.Date(inflation.2$period), "%Y-%m")*

Having run the command lines above, the variable *period* in the attached
CSV file remains being read as a character variable. Kindly assist.

Thank you.


Alternative email: addtar...@icloud.com/tchir...@rbz.co.zw
Skype: admirechirume
Call: +263773369884
whatsapp: +818099861504


On Thu, Sep 29, 2022 at 6:10 PM Jeff Newmiller 
wrote:

> Your attachment was stripped by the mailing list. The criteria for allowed
> attachments are a bit tricky to translate into actions to apply to your
> email software, so usually including part of your file in the body of the
> email is the most successful approach for communicating your problem. Be
> sure to use a text editor or the
>
>   readLines("filename.csv") |> head() |> dput()
>
> functions in R to extract lines of your file for inclusion in the email.
>
> On September 29, 2022 8:52:30 AM PDT, Admire Tarisirayi Chirume <
> atchir...@gmail.com> wrote:
> >I kindly request for assistance to convert a Date variable from a
> character
> >to be recognised as a date. I used the following command lines.
> >
> >inflation<-read.csv("Inflation_forecasts_1.csv")
> >attach(inflation)
> >inflation[,1:2 ] #subsetting the dataframe
> >#Renaming variables
> >inflation<- rename(inflation.df,
> >   cpi = CPI,
> >   year=period)
> >
> >#subsetting data April 2020 to current
> >inflation.2<-data.frame(inflation[-c(1:135),])
> >class(inflation.2$cpi)
> >inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
> >* format(as.Date(inflation.2$period), "%Y-%m")*
> >
> >Having ran the command lines above, the variable period in the attached
> csv
> >file remains being read as a character variable. Kindly assist.
> >
> >Thank you.
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How long does it take to learn the R programming language?

2022-09-29 Thread jim holtman
Still at it after 38 years.  First came across S at Bell Labs in 1984.

Thanks

Jim Holtman
*Data Munger Guru*


*What is the problem that you are trying to solve? Tell me what you want to
do, not how you want to do it.*


On Thu, Sep 29, 2022 at 7:09 AM Ebert,Timothy Aaron  wrote:

> Learning R takes an hour. Find an hourglass, flip it over. Meanwhile we
> will start increasing the size of the upper chamber and adding more sand.
>
> Mastery of R is an asymptotic function of time.
>
> While such answers might indicate trying for mastery is futile, you can
> learn enough R to be very useful long before "mastery."
>
> Tim
> -Original Message-
> From: R-help  On Behalf Of Avi Gross
> Sent: Wednesday, September 28, 2022 5:51 PM
> To: John Kane 
> Cc: R. Mailing List 
> Subject: Re: [R] How long does it take to learn the R programming language?
>
> [External Email]
>
> So is the proper R answer simply Inf?
>
> On Wed, Sep 28, 2022, 5:39 PM John Kane  wrote:
>
> > + 1
> >
> > On Wed, 28 Sept 2022 at 17:36, Jim Lemon  wrote:
> >
> > > Given some of the questions that are posted to this list, I am not
> > > sure that there is an upper bound to the estimate.
> > >
> > > Jim
> > >
> > >
> >
> >
> > --
> > John Kane
> > Kingston ON Canada
> >
> > [[alternative HTML version deleted]]
> >
> >
>
> [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: [R] Converting a Date variable from character to Date

2022-09-29 Thread Jeff Newmiller
Your attachment was stripped by the mailing list. The criteria for allowed 
attachments are a bit tricky to translate into actions to apply to your email 
software, so usually including part of your file in the body of the email is 
the most successful approach for communicating your problem. Be sure to use a 
text editor or the

  readLines("filename.csv") |> head() |> dput()

functions in R to extract lines of your file for inclusion in the email.

On September 29, 2022 8:52:30 AM PDT, Admire Tarisirayi Chirume 
 wrote:
>I kindly request for assistance to convert a Date variable from a character
>to be recognised as a date. I used the following command lines.
>
>inflation<-read.csv("Inflation_forecasts_1.csv")
>attach(inflation)
>inflation[,1:2 ] #subsetting the dataframe
>#Renaming variables
>inflation<- rename(inflation.df,
>   cpi = CPI,
>   year=period)
>
>#subsetting data April 2020 to current
>inflation.2<-data.frame(inflation[-c(1:135),])
>class(inflation.2$cpi)
>inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
>* format(as.Date(inflation.2$period), "%Y-%m")*
>
>Having ran the command lines above, the variable period in the attached csv
>file remains being read as a character variable. Kindly assist.
>
>Thank you.
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Converting a Date variable from character to Date

2022-09-29 Thread Admire Tarisirayi Chirume
I kindly request for assistance to convert a Date variable from a character
to be recognised as a date. I used the following command lines.

inflation<-read.csv("Inflation_forecasts_1.csv")
attach(inflation)
inflation[,1:2 ] #subsetting the dataframe
#Renaming variables
inflation<- rename(inflation.df,
   cpi = CPI,
   year=period)

#subsetting data April 2020 to current
inflation.2<-data.frame(inflation[-c(1:135),])
class(inflation.2$cpi)
inflation.2$cpi <- as.numeric(as.character(inflation.2$cpi))
* format(as.Date(inflation.2$period), "%Y-%m")*

Having ran the command lines above, the variable period in the attached csv
file remains being read as a character variable. Kindly assist.

Thank you.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading very large text files into R

2022-09-29 Thread Nick Wray
Hello Ivan's suggestion of fill=T seems to do the trick.  Thanks to
everyone who piled in - I'm rather touched by the support seeing as this
was causing me a big headache with furthering my project.  I also feel
humbled by realising how little I know about the R-universe... Nick
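
For the archive, a small sketch of how fill = TRUE can be combined with
count.fields(). The three lines of text are made up to mimic the problem (most
rows have 3 fields, one has an extra "B"), not taken from the Met Office file.
As far as I understand the documentation, read.table() decides the number of
columns from the first five lines unless col.names says otherwise, so passing
col.names sized to the widest row guards against an extra field that first
appears further down.

```r
# Made-up stand-in for the real file: the second row has one extra field.
txt <- c("1,2,3", "4,5,6,B", "7,8,9")

# Count the fields on every line to find the widest row.
tc <- textConnection(txt)
n <- count.fields(tc, sep = ",")
close(tc)

# fill = TRUE pads short rows; col.names fixes the column count up front.
dat <- read.table(text = txt, sep = ",", fill = TRUE,
                  col.names = paste0("V", seq_len(max(n))))
dim(dat)
```

With a real file, the same calls take the file name instead of text = txt, and
which(n != 15) would list the lines that carry the extra element.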

On Thu, 29 Sept 2022 at 15:09, Ivan Krylov  wrote:

> В Thu, 29 Sep 2022 14:54:10 +0100
> Nick Wray  пишет:
>
> > although most lines in the text doc consist of 15 elements, every so
> > often there is a sixteenth one and R doesn’t like this and gives me
> > an error message
>
> Does the fill = TRUE argument of read.table() help?
>
> If not, could you construct and share a small file with the same kind
> of problem (16th field) but without the data one has to apply for
> access to? (E.g. cut out a few lines from the original file, then
> replace all digits.)
>
> --
> Best regards,
> Ivan
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading very large text files into R

2022-09-29 Thread Jeff Newmiller
"Confusion" is the size of the file. Try specifying the colClasses argument to 
nail down the number and type of the columns.
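
A minimal sketch of what that could look like, using two made-up lines rather
than the real data; the column types chosen here are assumptions for
illustration only:

```r
# Two made-up rows with an integer, a numeric and a character field.
txt <- c("1,2.5,x", "2,3.5,y")

# colClasses states the type of every column, so nothing is guessed.
dat <- read.csv(text = txt, header = FALSE,
                colClasses = c("integer", "numeric", "character"))
sapply(dat, class)
```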

On September 29, 2022 8:16:34 AM PDT, Bert Gunter  
wrote:
>I had no trouble reading your text snippet with
>read.csv(text =
>"... your text... ")
>
>There were 15 columns. The last column was all empty except for the row
>containing the "B".
>
>So there seems to be some confusion here.
>
>-- Bert

-- 
Sent from my phone. Please excuse my brevity.



Re: [R] Reading very large text files into R

2022-09-29 Thread Nick Wray
Hi Bert, right. Thing is, I didn't know that there even was an instruction
like read.csv(text = "... your text... "), so at any rate I can paste the
original text files in by hand if there's no shorter cut.
Thanks very much, Nick

On Thu, 29 Sept 2022 at 16:16, Bert Gunter  wrote:

> I had no trouble reading your text snippet with
> read.csv(text =
> "... your text... ")
>
> There were 15 columns. The last column was all empty except for the row
> containing the "B".
>
> So there seems to be some confusion here.
>
> -- Bert


Re: [R] Reading very large text files into R

2022-09-29 Thread Bert Gunter
I had no trouble reading your text snippet with
read.csv(text =
"... your text... ")

There were 15 columns. The last column was all empty except for the row
containing the "B".

So there seems to be some confusion here.

-- Bert
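
For reference, the read.csv(text = ...) call above might look like the
following minimal sketch. It uses three of the lines posted elsewhere in
this thread. header = FALSE and strip.white = TRUE are assumptions added
here (the snippet has no header row, and each field carries a leading
blank); they are not part of the original suggestion.

```r
# Three lines copied from the snippet posted in this thread.
txt <- c(
  "1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,",
  "1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, , , B",
  "1980-01-01 10:00, 234682, RAIN, 1, 1, WAHRAIN, 5271, 1001, 0, , 9, 0, , ,"
)

# header = FALSE: the snippet has no header row (assumption).
# strip.white = TRUE: drop the blank that follows each comma.
dat <- read.csv(text = txt, header = FALSE, strip.white = TRUE)

dim(dat)   # rows and columns actually parsed
dat$V15    # last column: empty except for the flagged row
```

Note that read.csv() sets fill = TRUE by default, whereas read.table()
does not, which may explain why the two behave differently on ragged input.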








Re: [R] Fwd: Reading very large text files into R

2022-09-29 Thread Enrico Schumann
On Thu, 29 Sep 2022, Nick Wray writes:

> -- Forwarded message -
> From: Nick Wray 
> Date: Thu, 29 Sept 2022 at 15:32
> Subject: Re: [R] Reading very large text files into R
> To: Ben Tupper 
>
>
> Hi Ben
> Beneath is an example of the text (also in an attachment) and it's the "B",
> of which there are quite a few scattered throughout the text doc which
> causes the reading in error message (btw I don't need the "RAIN" column or
> the 1's after it or the last four elements).   I have also attached the
> snippet as text file
>
> 1980-01-01 10:00, 225620, RAIN, 1, 1, WAHRAIN, 5091, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 226918, RAIN, 1, 1, WAHRAIN, 5124, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 228562, RAIN, 1, 1, WAHRAIN, 491, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 231581, RAIN, 1, 1, WAHRAIN, 5213, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 232671, RAIN, 1, 1, WAHRAIN, 487, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, , , B
> 1980-01-01 10:00, 234682, RAIN, 1, 1, WAHRAIN, 5271, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 235389, RAIN, 1, 1, WAHRAIN, 5279, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 236466, RAIN, 1, 1, WAHRAIN, 497, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 243350, RAIN, 1, 1, SREW, 484, 1001, 0, , 9, 0, , ,
> 1980-01-01 10:00, 243350, RAIN, 1, 1, WAHRAIN, 484, 1001, 0, 0, 9, 9, , ,
>
> Thanks Nick
>

Maybe I have missed it, but could you please show how
you tried to read the table?

When I use your file with 

read.table("sample text.txt", header = FALSE, sep = ",")

I get

##                  V1     V2    V3 V4 V5       V6   V7   V8 V9 V10   V11 V12 V13 V14 V15
## 1  1980-01-01 10:00 225620  RAIN  1  1  WAHRAIN 5091 1001  0  NA     9   0  NA  NA
## 2  1980-01-01 10:00 226918  RAIN  1  1  WAHRAIN 5124 1001  0  NA     9   0  NA  NA
## ...
## 7  1980-01-01 10:00 234362  RAIN  1  1  WAHRAIN 5265 1001  0  NA 10009   0  NA  NA   B
## 8  1980-01-01 10:00 234682  RAIN  1  1  WAHRAIN 5271 1001  0  NA     9   0  NA  NA



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net
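
Nick mentions not needing the "RAIN" column, the 1's after it, or the last
four elements. One way to drop columns at read time is the colClasses
argument of read.table(), where "NULL" skips a column. The sketch below is
an illustration on three lines from the posted snippet, assuming the
unwanted fields are columns 3 to 5 and 12 to 15; how these positions map to
the Met Office field names is not checked here.

```r
# Three lines copied from the snippet posted in this thread.
txt <- c(
  "1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,",
  "1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, , , B",
  "1980-01-01 10:00, 234682, RAIN, 1, 1, WAHRAIN, 5271, 1001, 0, , 9, 0, , ,"
)

# NA = let R guess the column type, "NULL" = skip the column entirely.
# The column positions below are an assumption made for this example.
keep <- rep(NA_character_, 15)
keep[c(3:5, 12:15)] <- "NULL"

dat <- read.table(text = txt, header = FALSE, sep = ",",
                  strip.white = TRUE, colClasses = keep)

str(dat)   # only the columns that were not skipped remain
```

Skipping columns this way also saves memory on large files, since the
skipped fields are never stored.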



[R] Fwd: Reading very large text files into R

2022-09-29 Thread Nick Wray
-- Forwarded message -
From: Nick Wray 
Date: Thu, 29 Sept 2022 at 15:32
Subject: Re: [R] Reading very large text files into R
To: Ben Tupper 


Hi Ben
Beneath is an example of the text (also in an attachment) and it's the "B",
of which there are quite a few scattered throughout the text doc which
causes the reading in error message (btw I don't need the "RAIN" column or
the 1's after it or the last four elements).   I have also attached the
snippet as text file

1980-01-01 10:00, 225620, RAIN, 1, 1, WAHRAIN, 5091, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 226918, RAIN, 1, 1, WAHRAIN, 5124, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 228562, RAIN, 1, 1, WAHRAIN, 491, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 231581, RAIN, 1, 1, WAHRAIN, 5213, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 232671, RAIN, 1, 1, WAHRAIN, 487, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 232913, RAIN, 1, 1, WAHRAIN, 5243, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 234362, RAIN, 1, 1, WAHRAIN, 5265, 1001, 0, , 10009, 0, , , B
1980-01-01 10:00, 234682, RAIN, 1, 1, WAHRAIN, 5271, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 235389, RAIN, 1, 1, WAHRAIN, 5279, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 236466, RAIN, 1, 1, WAHRAIN, 497, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 243350, RAIN, 1, 1, SREW, 484, 1001, 0, , 9, 0, , ,
1980-01-01 10:00, 243350, RAIN, 1, 1, WAHRAIN, 484, 1001, 0, 0, 9, 9, , ,

Thanks Nick


Re: [R] Reading very large text files into R

2022-09-29 Thread Jan van der Laan
You're sure the extra column is indeed an extra column? According to the 
documentation 
(https://artefacts.ceda.ac.uk/badc_datadocs/ukmo-midas/RH_Table.html) 
there should be 15 columns.


Could it, for example, be that one of the columns contains records with 
commas?


Jan
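
One way to check this kind of thing is to count the fields line by line
before parsing. The sketch below uses count.fields() from base R on a small
made-up file (the contents are invented for illustration and are not Met
Office data); lines whose count differs from the rest are the ones worth
inspecting by eye.

```r
# Made-up example file: line 4 carries one field more than the others.
path <- tempfile(fileext = ".txt")
writeLines(c("a,1,x", "b,2,y", "c,3,z", "d,4,w,B"), path)

# quote = "" so that stray quote characters cannot hide separators.
n_fields <- count.fields(path, sep = ",", quote = "")

table(n_fields)               # how many lines have each field count
odd <- which(n_fields != 3)   # line numbers that deviate from the usual count
readLines(path)[odd]          # look at the offending lines
```

If the deviating lines turn out to contain a free-text field with embedded
commas, the extra "column" is really part of an existing one.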





Re: [R] Reading very large text files into R

2022-09-29 Thread Ben Tupper
Hi Nick,

It's hard to know without seeing at least a snippet of the data.
Could you do the following and paste the result into a plain text
email?  If you don't set your email client to plain text (from rich
text or html) then we are apt to see a jumble of output on our email
clients.


## start
x <- readLines(filename, n = 20)
cat(x, sep = "\n")
## end

Cheers,
Ben





-- 
Ben Tupper (he/him)
Bigelow Laboratory for Ocean Science
East Boothbay, Maine
http://www.bigelow.org/
https://eco.bigelow.org



Re: [R] Reading very large text files into R

2022-09-29 Thread Ivan Krylov
On Thu, 29 Sep 2022 14:54:10 +0100
Nick Wray wrote:

> although most lines in the text doc consist of 15 elements, every so
> often there is a sixteenth one and R doesn’t like this and gives me
> an error message

Does the fill = TRUE argument of read.table() help?

If not, could you construct and share a small file with the same kind
of problem (16th field) but without the data one has to apply for
access to? (E.g. cut out a few lines from the original file, then
replace all digits.)

-- 
Best regards,
Ivan
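
As a rough illustration of the fill = TRUE suggestion, on made-up data
rather than the Met Office file: according to its help page, read.table()
determines the number of columns from the first five lines of input, so
when the longer line only appears later it may help to supply col.names
explicitly as well. The names V1 to V4 below are placeholders.

```r
# Made-up data: the first five lines have 3 fields, the sixth has 4.
txt <- c("a,1,x", "b,2,y", "c,3,z", "d,4,w", "e,5,v", "f,6,u,B")

# fill = TRUE pads short rows with blank fields; col.names tells
# read.table() up front that there may be a fourth column.
dat <- read.table(text = txt, sep = ",", header = FALSE,
                  fill = TRUE, col.names = paste0("V", 1:4))

dim(dat)   # one row per input line, four columns
dat$V4     # blank except for the last row
```

For the real files the same idea would presumably use 16 names instead of 4.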



Re: [R] How long does it take to learn the R programming language?

2022-09-29 Thread Ebert,Timothy Aaron
Learning R takes an hour. Find an hourglass, flip it over. Meanwhile we will 
start increasing the size of the upper chamber and adding more sand. 

Mastery of R is an asymptotic function of time. 

While such answers might indicate trying for mastery is futile, you can learn 
enough R to be very useful long before "mastery."

Tim
-Original Message-
From: R-help  On Behalf Of Avi Gross
Sent: Wednesday, September 28, 2022 5:51 PM
To: John Kane 
Cc: R. Mailing List 
Subject: Re: [R] How long does it take to learn the R programming language?

So is the proper R answer simply Inf?

On Wed, Sep 28, 2022, 5:39 PM John Kane  wrote:

> + 1
>
> On Wed, 28 Sept 2022 at 17:36, Jim Lemon  wrote:
>
> > Given some of the questions that are posted to this list, I am not 
> > sure that there is an upper bound to the estimate.
> >
> > Jim
> >
> >
>
>
> --
> John Kane
> Kingston ON Canada
>
>



[R] Reading very large text files into R

2022-09-29 Thread Nick Wray
Hello   I may be offending the R purists with this question but it is
linked to R, as will become clear.  I have very large data sets from the UK
Met Office in notepad form.  Unfortunately,  I can’t read them directly
into R because, for some reason, although most lines in the text doc
consist of 15 elements, every so often there is a sixteenth one and R
doesn’t like this and gives me an error message because it has assumed that
every line has 15 elements and doesn’t like finding one with more.  I have
tried playing around with the text document, inserting an extra element
into the top line etc, but to no avail.

Also unfortunately you need access permission from the Met Office to get
the files in question so this link probably won’t work:

https://catalogue.ceda.ac.uk/uuid/bbd6916225e7475514e17fdbf11141c1

So what I have done is simply to copy and paste the text docs into Excel
csv and then read them in, which is time-consuming but works.  However, the
later datasets are over the Excel limit of 1048576 lines.  I can paste in
the first 1048576 lines, but then trying to isolate the remainder of the
text doc to paste it into a second csv doc is proving very difficult: the
only way I have found is to scroll down by hand, and that’s taking ages.  I
cannot find another way of editing the notepad text doc to get rid of the
part which I have already copied and pasted.

Can anyone help with a) ideally, being able to simply read the text tables
into R, or b) a way of editing out the bits of the text file I have
already pasted in without laborious scrolling?

Thanks Nick Wray
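
On (b), one possible way to avoid the manual scrolling is to let
read.table() skip the lines that were already handled, using its skip and
nrows arguments. This is only a sketch on a small temporary file:
chunk_size stands in for the 1048576-line limit mentioned above, and the
file contents are invented.

```r
# Invented stand-in for the large text file: 10 lines, 2 fields each.
path <- tempfile(fileext = ".txt")
writeLines(sprintf("row%d,%d", 1:10, 1:10), path)

chunk_size <- 4   # placeholder for the number of lines already processed

# The part that was already handled ...
first_part <- read.table(path, sep = ",", header = FALSE, nrows = chunk_size)

# ... and everything after it, without touching the file in an editor.
rest <- read.table(path, sep = ",", header = FALSE, skip = chunk_size)

nrow(first_part)
nrow(rest)
```

R itself has no such row limit, so reading the whole file in one call is
usually the simpler route when memory allows.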



Re: [R] Question about Line Ending Choice

2022-09-29 Thread Enrico Schumann
On Tue, 27 Sep 2022, Stephen H. Dawson, DSL via R-help writes:

> Hi All,
>
>
> I am writing with a question about choosing the line
> ending aspect of a file, please.
>
> I use write.csv and write.table to export work to CSV
> files and TXT files. I am planning now on how to share
> my work with the Windows crowd beyond only sharing with
> the Linux crowd. I use my text editor to flip the line
> ending option from Linux to Windows after
> exporting. This is inefficient for me to accomplish if
> I ramp up production as I expect will occur.
>
> Staying with the character encoding of UTF-8 seems fine
> for now from what I understand I need to deliver to my
> customers.
>
> What seems more efficient to me is to learn how to use
> R to define the line ending aspect of the exported
> file. I have not found if this is an option within R.
>
> QUESTION
> Is it possible within R to define the line ending aspect of file output?
>
>
> Kindest Regards,

Just a remark: there is a "standard" for CSV,
https://datatracker.ietf.org/doc/html/rfc4180.
It always requires CRLF as the line ending.

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net
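
For the question as asked, write.table() and write.csv() take an eol
argument. The sketch below writes CRLF line endings through a connection
opened in binary mode ("wb"), which is meant to keep the platform from
translating the line endings a second time; the data frame and the file
name are placeholders.

```r
# Placeholder data and output file.
dat <- data.frame(id = 1:2, label = c("a", "b"))
path <- tempfile(fileext = ".csv")

# Binary-mode connection: the bytes given in eol are written as they are.
con <- file(path, open = "wb")
write.csv(dat, con, row.names = FALSE, eol = "\r\n")
close(con)

# Inspect the raw bytes to confirm how each line ends.
bytes <- readBin(path, what = "raw", n = file.size(path))
tail(bytes, 2)   # expect the bytes 0d 0a (CR, LF)
```

The same eol argument with "\n" would give Unix-style endings, so one
script can produce either variant.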
