Re: [R] splitting a vector of strings

2016-07-21 Thread Michael Dewey

Dear Eric

I think you are looking for sub or gsub.

Without an example set of input and output I am not quite sure, but you 
would need to define an expression which matches your separator (;) 
followed by any characters up to the end of the line. If you have trouble 
with that then someone here will no doubt write the pattern for you, but 
learning about regular expressions is well worthwhile.
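
As a minimal sketch of that suggestion, assuming the sample vector x below (made up for illustration, since no data was posted), the pattern ";.*$" matches the first separator and everything after it, so sub() leaves only the first piece:

```r
# Hypothetical sample input; the real data was not posted.
x <- c("a;b;c", "d;e", "foo;g;h;i")

# ";.*$" matches the first ";" and everything up to the end of the string;
# replacing that match with "" keeps only the first piece.
first <- sub(";.*$", "", x)
first
```

Strings that contain no ";" are returned unchanged, and the result is a character vector rather than a list.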



--
Michael
http://www.dewey.myzen.co.uk/home.html

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splitting a vector of strings

2016-07-21 Thread Ben Tupper
Hi,

I'm not sure about the more generalized solution, but how about this for a start?


x <- c("a;b;c", "d;e", "foo;g;h;i")
x
#[1] "a;b;c" "d;e"   "foo;g;h;i"

sapply(strsplit(x, ";",fixed = TRUE), '[',1)
#[1] "a"   "d"   "foo"
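
If the type of the result matters, vapply is a slightly stricter variant of the same idea. This is only a sketch on the same sample x; it assumes every element should yield exactly one character value.

```r
x <- c("a;b;c", "d;e", "foo;g;h;i")

# vapply() checks that each extraction returns a single character value,
# so the result is always a character vector and never a list.
first <- vapply(strsplit(x, ";", fixed = TRUE), `[`, character(1), 1)
first
```

Changing the final 1 to 2 would pick the second piece instead; elements with fewer pieces than requested come back as NA.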

If you want elegance then I suggest you take a look at the stringr package. 

https://cran.r-project.org/web/packages/stringr/index.html

Cheers,
Ben



Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

Report Gulf of Maine jellyfish sightings to jellyf...@bigelow.org or tweet them 
to #MaineJellies -- include date, time, and location, as well as any 
descriptive information such as size or type.  Learn more at 
https://www.bigelow.org/research/srs/nick-record/nick-record-laboratory/mainejellies/


[R] splitting a vector of strings

2016-07-21 Thread Eric Elguero

Hi everybody,

I have a vector of character strings.
Each string has the same pattern and I want
to split them in pieces and get a vector made
of the first pieces of each string.

The problem is that strsplit returns a list.

All I found is

uu<- matrix(unlist(strsplit(x,";")),ncol=3,byrow=T)[,1]

where x is the vector, ";" is the delimiting character,
and I know that each string will be cut in 3 pieces.

That works for my problem but I would prefer a
more elegant solution. Besides, it would not
work if the strings didn't all have the same
number of pieces.

Does someone have a better solution?

Sorry if that topic was discussed recently.
There is too much traffic on the r-help list;
I cannot catch up.

--
Eric Elguero

MIVEGEC. - UMR (CNRS/IRD/UM) 5290
Maladies Infectieuses et Vecteurs, Génétique, Evolution et Contrôle
Institut de Recherche pour le Développement (IRD)
911, Avenue Agropolis
BP 64501
34394 Montpellier Cedex 5, France


Re: [R] Splitting Numerical Vector Into Chunks

2016-04-25 Thread PIKAL Petr
Hi

Is this

http://stackoverflow.com/questions/2150138/how-to-parse-milliseconds-in-r

what you want?

Cheers
Petr
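
For the h:m:s:00 format described in the quoted question below, one base-R possibility is to split on ":" and convert to elapsed seconds. This is a rough sketch with made-up timestamps; the variable names are assumptions, and it presumes every value has exactly four fields.

```r
# Hypothetical timestamps in h:m:s:hundredths format.
tm <- c("0:00:01:50", "1:02:03:04")

# One row per timestamp, one column per field.
parts <- do.call(rbind, strsplit(tm, ":", fixed = TRUE))
storage.mode(parts) <- "numeric"

# Weight hours, minutes, seconds and hundredths to get elapsed seconds.
secs <- as.vector(parts %*% c(3600, 60, 1, 0.01))
secs
```

If a time class is wanted, the result can be wrapped with as.difftime(secs, units = "secs").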


> -Original Message-
> From: Sidoti, Salvatore A. [mailto:sidoti...@buckeyemail.osu.edu]
> Sent: Sunday, April 24, 2016 1:48 AM
> To: PIKAL Petr <petr.pi...@precheza.cz>; William Dunlap
> <wdun...@tibco.com>; Ista Zahn <istaz...@gmail.com>
> Subject: RE: [R] Splitting Numerical Vector Into Chunks
>
> There are terrific suggestions and I so appreciate everyone's help!
>
> Just one additional question: I also have some time data that accompanies
> this analysis. It is in the format h:m:s:00, where the last item is
> hundredths of a second. I tried various things with the date and time
> functions in R, but I have not found a way to convert the vector into a
> workable time class.
>
> -Original Message-
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Thursday, April 21, 2016 4:13 AM
> To: William Dunlap <wdun...@tibco.com>; Ista Zahn <istaz...@gmail.com>
> Cc: Sidoti, Salvatore A. <sidoti...@buckeyemail.osu.edu>
> Subject: RE: [R] Splitting Numerical Vector Into Chunks
>
> Hi
>
> Another approach is to use diff to find steps.
>
> split(x, cumsum(abs(c(1, diff(x == 0)))))
>
> Cheers
> Petr
>

Re: [R] Splitting Numerical Vector Into Chunks

2016-04-20 Thread William Dunlap via R-help
> i <- seq_len(length(x)-1)
> split(x, cumsum(c(TRUE, (x[i]==0) != (x[i+1]==0))))
$`1`
[1] 0.144872972504 0.850797178400

$`2`
[1] 0 0 0

$`3`
[1] 0.199304859380 2.063609410700 0.939393760782 0.838781367540

$`4`
[1] 0 0 0 0 0

$`5`
[1] 0.374688091264 0.488423999452 0.783034615362 0.626990428900
0.138188255307 2.324635712186

$`6`
[1] 0 0 0 0 0 0 0
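
The same chunking can also be sketched with rle(), which additionally records which chunks are pauses and how long they last. The short vector x here is invented so that the result is reproducible; it is not the poster's data.

```r
# Hypothetical speeds with two pauses (runs of zeros).
x <- c(0.5, 1.2, 0, 0, 0, 0.7, 0.3, 0, 0)

# Run-length encode the zero / nonzero indicator.
r <- rle(x == 0)

# Give every element the index of the run it belongs to, then split.
run_id <- rep(seq_along(r$lengths), r$lengths)
chunks <- split(x, run_id)

# r$values is TRUE for pause runs, so this picks out the pause durations.
pause_lengths <- r$lengths[r$values]
```

Here chunks alternates between movement and pause segments, and pause_lengths gives the number of samples in each pause.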


Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [R] Splitting Numerical Vector Into Chunks

2016-04-20 Thread Ista Zahn
Perhaps

x <- split(x, x == 0)

Best,
Ista



[R] Splitting Numerical Vector Into Chunks

2016-04-20 Thread Sidoti, Salvatore A.
Greetings!

I have several large data sets of animal movements. Their pauses (zero 
magnitude vectors) are of particular interest in addition to the speed 
distributions that precede the periods of rest. Here is an example of the kind 
of data I am interested in analyzing:

x <- abs(c(rnorm(2),replicate(3,0),rnorm(4),replicate(5,0),rnorm(6),replicate(7,0)))
length(x)

This example has 27 elements with strings of zeroes (pauses) situated among the 
speed values.
Is there a way to split the vector into zero and nonzero chunks and store them 
in a form where they can be analyzed? I have tried various forms of split() to 
no avail.

Thank you!
Salvatore A. Sidoti



Re: [R] Splitting a vector into data frame

2016-03-24 Thread Ivan Calandra

Hi!

As Boris explained, if you do not always have the same number of values 
per country, you need to provide more details, e.g. should the empty 
cells be filled with NA?


But if you do always have 20 values per country (unlike in your sample 
data), then this could work for you:

mydf <- data.frame(matrix(temp.data, nrow=2, ncol=22, byrow=TRUE))
You can then subset to remove the 1st column:
mydf[-1]

HTH,
Ivan

--
Ivan Calandra, PhD
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calan...@univ-reims.fr
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/



Re: [R] Splitting a vector into data frame

2016-03-24 Thread Jim Lemon
Hi Burhan,
As all of your values seem to be character, perhaps:

country.df<-as.data.frame(matrix(temp.data,ncol=22,byrow=TRUE)[,2:21])

if there really are 2 country names and 20 values for each country. As
Boris has pointed out, there are different numbers of values following
the country names in your example.

Jim




Re: [R] Splitting a vector into data frame

2016-03-24 Thread Boris Steipe
Your data rows have different numbers of columns. Thus your problem is not 
sufficiently specified.

B. 
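
One way the ragged rows could be handled, if padding with NA is acceptable, is sketched below. It uses shortened, made-up stand-ins for temp.data and raw.country, and it assumes that a record starts wherever a country name follows a non-name and that every country has at least one value; neither assumption was confirmed in the thread.

```r
# Shortened, hypothetical stand-ins for the posted vectors.
temp.data   <- c("Armenia", "Armenia", "43827", "39200", " 0",
                 "Austria", "Austria", "135417", "166200")
raw.country <- c("Armenia", "Austria", "Belarus")

# Flag the country names; a record starts at a name not preceded by a name.
is_name <- temp.data %in% raw.country
starts  <- is_name & !c(FALSE, head(is_name, -1))
grp     <- cumsum(starts)

# Collect the (whitespace-trimmed) values of each record, dropping the names.
values <- split(trimws(temp.data[!is_name]), grp[!is_name])
names(values) <- temp.data[starts]

# Pad short records with NA so that all rows have the same width.
width  <- max(lengths(values))
padded <- t(sapply(values, function(v) c(v, rep(NA, width - length(v)))))
country.df <- as.data.frame(padded, stringsAsFactors = FALSE)
```

The columns are still character at this point; converting them with as.numeric is a separate step.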


[R] Splitting a vector into data frame

2016-03-24 Thread Burhan ul haq
Hi,

1. I have scraped some data from the web, subset shown below

> dput(temp.data)
c("Armenia", "Armenia", "43827", "39200", "35700", "36700", "39341",
"30571", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", " 0",
"0", "0", "0", "0", "Austria", "Austria", "135417", "166200",
"144500", "147300", "163211", "162536", "155412", "133667", "134962",
"146440", "131188", "11", "10", "8", "35000")

2. The corresponding list of countries, is as follows

> dput(raw.country)
c("Armenia", "Austria", "Belarus", "Belgium", "Brazil", "Bulgaria",
"Canada", "Castile-Leon (Hiszania)", "Catalonia", "Chile", "Colombia",
"Costarica", "Croatia", "Cyprus", "Czech Republic", "Ecuador",
"Estonia", "Finland", "France", "Georgia", "Germany", "Ghana",
"Greece", "Hungary", "Indonesia", "Iran", "Ireland", "Israel",
"Italy", "Kazakhstan", "Kyrgyzstan", "Latvia", "Lithuania", "Macedonia",
"Malaysia", "Mexico", "Moldova", "Mongolia", "Netherland", "Norway",
"Pakistan", "Panama", "Paraguay", "Peru", "Poland", "Portugal",
"Puertorico", "Romania", "Russia", "Serbia", "Slovakia", "Slovenia",
"Spain", "Sweden", "Switzerland", "Tunisia", "Ukraine", "United Kingdom",
"USA", "Venezuela", "Vltava", "World Total")


3. I want to organize the data into a data frame, where each row will
contain the 20 values for the corresponding country.
It needs to ignore the country name, which appears twice. Something like:

Armenia "43827", "39200", "35700", "36700", "39341",
"30571", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", " 0",
"0", "0", "0", "0",

"Austria", "135417", "166200",
"144500", "147300", "163211", "162536", "155412", "133667", "134962",
"146440", "131188", "11", "10", "8", "35000"

and so on


Thanks /



Re: [R] Splitting a vector

2013-12-11 Thread Gerrit Eichner

Hello, Long Vo,

take a look at the help page of split or directly at

str(Y)

They tell you that Y is a list, and list components are indexed using [[:


mean(Y[[4]])

should do what you want.
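
As a small sketch of the difference, using X = 1:12 from the original question: single brackets return a list of length one, double brackets return the sub-vector itself, and sapply applies a function to every component at once.

```r
X <- 1:12
Y <- split(X, as.numeric(gl(length(X), 3, length(X))))

# Y[4] is still a list; Y[[4]] is the integer vector 10, 11, 12.
m4 <- mean(Y[[4]])

# Apply mean() to every sub-vector in one call.
means <- sapply(Y, mean)
```

The names of means are the group labels "1" to "4" created by split.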

  Regards  --  Gerrit





Re: [R] Splitting a vector

2013-12-10 Thread Jeff Newmiller
You don't need to wrap 1:12 in c().

Since matrices are just folded vectors, you can convert vector X to a matrix 
Xm:

Xm <- matrix( X, nrow=3 )

and access columns to get your sub-vectors:

Xm[,1]
Xm[,2]

and so on.
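
If named sub-vectors are wanted, the columns can be labelled when the matrix is built. This is a sketch only; the names part1 to part4 are invented for illustration, and it assumes the length of X is a multiple of 3.

```r
X <- 1:12

# Each column holds one consecutive block of 3 values.
Xm <- matrix(X, nrow = 3, dimnames = list(NULL, paste0("part", 1:4)))

# Access a sub-vector by name, or summarise all of them at once.
second <- Xm[, "part2"]
col_means <- colMeans(Xm)
```

colMeans (or apply(Xm, 2, f) for an arbitrary function f) avoids an explicit loop over the sub-vectors.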




Re: [R] Splitting a vector

2013-12-10 Thread Long Vo
This does what I needed. However, as the output is a list object, is there
any way to apply a function to such an object? For example, if I want to compute
the mean of the 4th sub-vector, I can't simply use:
#
Y=split(X,as.numeric(gl(length(X),3,length(X))))
mean(Y[4])
#
as the error message shows "argument is not numeric or logical"



--
View this message in context: 
http://r.789695.n4.nabble.com/Splitting-a-vector-tp4681930p4681987.html
Sent from the R help mailing list archive at Nabble.com.



[R] Splitting a vector

2013-12-09 Thread Long Vo
Hi, I am quite new to R so I know that this probably is very basic, but how
can I split a sequence of numbers into multiple parts with equal length?
For example I have a vector

X=c(1:12)
I simply need to split it into sub-vectors with the same length N . Say N=3
then I need the output to be like 
1 2 3
4 5 6
7 8 9
10 11 12

And better if the sub-vectors can be named so that I can use them later for
individual study, probably a do-loop in which a function can be applied to
them.
I just want them to be in consecutive order so really no fancy conditions
here.  

Any help to this amateur is greatly appreciated,
Long



--
View this message in context: 
http://r.789695.n4.nabble.com/Splitting-a-vector-tp4681930.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Splitting a vector

2013-12-09 Thread arun
Hi,
Try:
split(X,as.numeric(gl(length(X),3,length(X))))
A.K.
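
An alternative way to build the grouping index, sketched here with a vector whose length is deliberately not a multiple of N, is to divide the positions by N and round up. The names N and parts are assumptions for the example.

```r
X <- 1:14
N <- 3

# Positions 1..N map to group 1, N+1..2N to group 2, and so on;
# a shorter final group simply holds whatever is left over.
parts <- split(X, ceiling(seq_along(X) / N))
```

The components are named "1", "2", ... and can be renamed with names(parts) <- if more descriptive labels are needed.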




Re: [R] splitting a vector

2012-08-02 Thread Jan van der Laan

I came up with:

runs <- function(numbers) {
    tmp <- diff(c(0, which(diff(numbers) <= 0), length(numbers)))
    split(numbers, rep(seq_along(tmp), tmp))
}



Can't say it's elegant, but it seems to work


runs(c(1:3, 1:4))

$`1`
[1] 1 2 3

$`2`
[1] 1 2 3 4


runs(c(1,1,1))

$`1`
[1] 1

$`2`
[1] 1

$`3`
[1] 1


runs(c(1:3, 2:3, 3))

$`1`
[1] 1 2 3

$`2`
[1] 2 3

$`3`
[1] 3


HTH,

Jan





[R] splitting a vector

2012-08-01 Thread capy_bara
Hello,

I have a vector with positive integer numbers, e.g. 

 numbers <- c(1,2,1,2,3,4,5)

and want to split the vector whenever an element in the vector is smaller or
equal to its predecessor. 
Hence I want to obtain two vectors: c(1,2) and c(1,2,3,4,5).
I tried with which(), but it is not so elegant:

 numbers[1:(which(numbers<=numbers[1])[2]-1)]
 numbers[which(numbers<=numbers[1])[2]:length(numbers)]

Sure, I can do it with a for-loop, but that seems a bit tedious for such a
small problem. 
Does anyone know a simple and elegant solution for this? I'm searching
for a general solution, since
my vector may change and may need to be split into more than two vectors, e.g.
five vectors for c(1,1,2,3,4,5,1,2,3,2,3,4,5,6,4,5).

Many thanks in advance,

Hannes







--
View this message in context: 
http://r.789695.n4.nabble.com/splitting-a-vector-tp4638675.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] splitting a vector

2012-08-01 Thread Rui Barradas

Hello,

Try the following.


fun <- function(x){
    n.diff <- cumsum(diff(c(x[1], x)) <= 0)
    split(x, n.diff)
}

numbers <- c(1,2,1,2,3,4,5)
fun(numbers)

fun(  c(1,1,2,3,4,5,1,2,3,2,3,4,5,6,4,5) )

Hope this helps,

Rui Barradas
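
A minimal sketch of why the cumsum() idea in this reply works, broken into steps. The names `is_break`, `group_id` and `pieces` are illustrative only and do not appear in the thread.

```r
# Sketch: turn "does a new run start here?" flags into group ids.
numbers <- c(1, 2, 1, 2, 3, 4, 5)

# TRUE wherever an element is smaller than or equal to its predecessor.
# The first element has no predecessor, so it is flagged by hand.
is_break <- c(TRUE, diff(numbers) <= 0)

# cumsum() counts the flags seen so far, which yields one id per run.
group_id <- cumsum(is_break)

# split() then collects the elements that share an id.
pieces <- split(numbers, group_id)
pieces
```

Because the grouping is driven entirely by the logical vector, changing the condition (for example to `< 0`, so that ties stay in the same run) only requires editing one line.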


Re: [R] splitting a vector

2012-08-01 Thread William Dunlap
The following 'f' counts the number of times the sequence x
does not increase.  Is this what you want?

> f <- function(x) split(x, cumsum(c(TRUE, x[-1] <= x[-length(x)])))
> f(numbers)
$`1`
[1] 1 2

$`2`
[1] 1 2 3 4 5

> f(c(1,1,2,3,4,5,1,2,3,2,3,4,5,6,4,5))
$`1`
[1] 1

$`2`
[1] 1 2 3 4 5

$`3`
[1] 1 2 3

$`4`
[1] 2 3 4 5 6

$`5`
[1] 4 5


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
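
As a quick cross-check, the sketch below defines the three functions suggested in this thread (`runs`, `fun` and `f`), assuming the comparison in each is `<=`, and confirms that they return the same list for the longer example vector. The names `v`, `res` and `all_same` are made up for the illustration.

```r
# The three suggestions from the thread, with "<=" as the split condition.
runs <- function(numbers) {
  tmp <- diff(c(0, which(diff(numbers) <= 0), length(numbers)))
  split(numbers, rep(seq_along(tmp), tmp))
}
fun <- function(x) split(x, cumsum(diff(c(x[1], x)) <= 0))
f   <- function(x) split(x, cumsum(c(TRUE, x[-1] <= x[-length(x)])))

v   <- c(1, 1, 2, 3, 4, 5, 1, 2, 3, 2, 3, 4, 5, 6, 4, 5)
res <- list(runs = runs(v), fun = fun(v), f = f(v))

# Number of pieces produced by each approach.
sapply(res, length)

# All three should give the same pieces, names included.
all_same <- identical(res$runs, res$fun) && identical(res$fun, res$f)
all_same
```

The functions differ mainly in how they build the grouping vector: `runs` works with run lengths, while `fun` and `f` accumulate break flags.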




[R] Splitting a Vector

2011-01-06 Thread Ben Ward

Hi all,

I read in a textbook that you can examine a variable that is collinear 
with others, and that gives different ANOVA output and explanatory power 
when ordered differently in the model formula, by modelling that 
explanatory variable against the others collinear with it. You then use 
that information to split the vector (explanatory variable) in question 
into two new vectors: one corresponds to the fitted values and the other 
to the residuals of the (I think you could call it nested) model. One 
vector should therefore be aligned with the subspace defined by the 
other variables collinear with it, and the other will be residual, and so 
orthogonal to the subspace of the collinear variables. By including 
these two variables in the original model (the one that showed the 
order dependency) you can see how much explanatory power the orthogonal 
part of the order-dependent variable has at different orders. In 
principle it shouldn't change, but the explanatory power of the vector 
made from the part co-aligned with the covariates will change as the 
order changes: it should decrease in the ANOVA as it moves away from being 
the first explanatory variable in the model.


Obviously finding the fitted model values and residual required to split 
the vector in two is a simple lm() with the right variables. But how 
would I create two new vectors from this and append them to my 
dataframe? Is there a package or function specially designed with this 
sort of task in mind?


Thanks,
Ben Ward.



Re: [R] Splitting a Vector

2011-01-06 Thread Greg Snow
I think that you are looking for the 'resid' and 'fitted' functions. These will 
give you the residuals and fitted values from an lm object (added together 
they give the original response, but they are orthogonal to each other).  Those values 
can then be assigned to a data frame or used by themselves.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
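
A minimal sketch of the suggestion above, on made-up data. The data frame `dat`, its columns `x1`, `x2`, `y`, and the new column names `x2_fit` and `x2_res` are all hypothetical; only `lm()`, `fitted()` and `resid()` come from the reply.

```r
# Hypothetical data: x2 is deliberately correlated with x1.
set.seed(42)
dat <- data.frame(x1 = rnorm(50))
dat$x2 <- 0.8 * dat$x1 + rnorm(50, sd = 0.5)
dat$y  <- 1 + 2 * dat$x1 + dat$x2 + rnorm(50)

# Regress the order-dependent variable on the variable it is correlated with.
aux <- lm(x2 ~ x1, data = dat)

# Split x2 into the part explained by x1 and the orthogonal remainder,
# and append both parts to the data frame as new columns.
dat$x2_fit <- unname(fitted(aux))
dat$x2_res <- unname(resid(aux))

# The two parts add back up to x2 and are orthogonal to each other.
all.equal(dat$x2_fit + dat$x2_res, dat$x2)
sum(dat$x2_fit * dat$x2_res)   # zero up to rounding error

# The residual part can then be entered into the original model.
anova(lm(y ~ x1 + x2_res, data = dat))
```

No special package seems to be needed for this step; ordinary column assignment with `$<-` is enough.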




[R] splitting a vector of strings...

2009-10-22 Thread Jonathan Greenberg
Quick question -- if I have a vector of strings that I'd like to split 
into two new vectors based on a substring that is inside of each string, 
what is the most efficient way to do this?  The substring that I want to 
split on is multiple characters, if that matters, and it is contained in 
every element of the character vector.


--j

--

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307



Re: [R] splitting a vector of strings...

2009-10-22 Thread William Dunlap

strsplit and sub can both be used for this.  If you know
the string will be split into 2 parts then 2 calls to sub
with slightly different patterns will do it.  strsplit requires
less fiddling with the pattern and is handier when the number
of parts is variable or large.  strsplit's output often needs to
be rearranged for convenient use.

E.g., I made 100,000 strings with a 'qaz' in their middles with
  x<-paste("X",sample(1e5),sep="")
  y<-sub("X","Y",x)
  xy<-paste(x,y,sep="qaz")
and split them by the 'qaz' in two ways:
  system.time(ret1<-list(x=sub("qaz.*","",xy),y=sub(".*qaz","",xy)))
  #   user  system elapsed 
  #   0.22    0.00    0.21 
  system.time({tmp<-strsplit(xy,"qaz");ret2<-list(x=unlist(lapply(tmp,`[`,1)),y=unlist(lapply(tmp,`[`,2)))})
  #   user  system elapsed 
  #   2.42    0.00    2.20 
  identical(ret1,ret2)
  #[1] TRUE
  identical(ret1$x,x) && identical(ret1$y,y)
  #[1] TRUE


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 
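
For a smaller illustration of the two approaches described above, here is a sketch on three made-up strings. The vector `xy` and the result names are hypothetical, and `vapply()` is used here instead of `unlist(lapply(...))` purely as a stylistic choice.

```r
xy <- c("alphaqazbeta", "gammaqazdelta", "oneqaztwo")

# Approach 1: two calls to sub(), deleting everything from the separator
# onwards, or everything up to and including it.
# Note: ".*" is greedy, so this assumes the separator occurs only once.
left1  <- sub("qaz.*", "", xy)
right1 <- sub(".*qaz", "", xy)

# Approach 2: strsplit() returns a list with one element per string,
# so the first and second pieces are pulled out position by position.
parts  <- strsplit(xy, "qaz", fixed = TRUE)
left2  <- vapply(parts, `[`, character(1), 1)
right2 <- vapply(parts, `[`, character(1), 2)

identical(left1, left2) && identical(right1, right2)   # TRUE
```

With `fixed = TRUE` the separator is taken literally, which avoids surprises when it contains regex metacharacters.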

 


Re: [R] splitting a vector of strings...

2009-10-22 Thread andrew
xs <- "this is string"
xsv <- paste(xs, 1:10)
sapply(xsv, function(x) strsplit(x, '\\sis\\s'))

This will split the vector of strings xsv on the word 'is' that has a
space immediately before and after it.





Re: [R] splitting a vector of strings...

2009-10-22 Thread Jonathan Greenberg

William et al:

   Thanks!  I think I have a somewhat more complicated issue due to the 
type of string I'm using -- the split is " | " (space pipe space) -- how 
do I code that based on your sub code below?  Using " | *" doesn't seem 
to be working.  Thanks!


--j


--

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307



Re: [R] splitting a vector of strings...

2009-10-22 Thread andrew
The following works: a double backslash removes the "or"
meaning of | in a regex.  (Bill Dunlap showed that you don't
need sapply for it to work.)

xs <- "this is | string"
xsv <- paste(xs, 1:10)
strsplit(xsv, "\\|")
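
To cover the full " | " separator (space, pipe, space) asked about in this thread, a sketch along these lines may help. The example strings and result names are made up. Left unescaped, the pattern " | " is read as a regex alternation between two single spaces, so it matches any space, which is probably why it did not behave as hoped.

```r
xsv <- paste("left part | right part", 1:3)

# Option 1: treat the separator as a literal string, no regex involved.
by_fixed <- strsplit(xsv, " | ", fixed = TRUE)

# Option 2: keep regex matching but escape the pipe.
by_regex <- strsplit(xsv, " \\| ")

identical(by_fixed, by_regex)   # TRUE

# The same escaping carries over to the two-call sub() approach.
left  <- sub(" \\| .*", "", xsv)
right <- sub(".* \\| ", "", xsv)
```

`fixed = TRUE` is usually the simpler choice when the separator is a known literal.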




Re: [R] Splitting a vector into equal groups

2009-05-05 Thread Uwe Ligges






See ?cut for groups of equal range, or
?co.intervals (in package graphics; equal.count() in package lattice is a 
wrapper around it) for intervals roughly equal in number of observations.


Uwe Ligges







[R] Splitting a vector into equal groups

2009-05-04 Thread utkarshsinghal
Hi All,

I have vector of length 52, say, x=sample(30,52,replace=T). I want to 
sort x and split into five *nearly equal groups*. Note that the 
observations are repeated in x so in case of a tie I want both the 
observations to fall in same group.
This seems a very common task to do, but still I couldn't find an R 
function to do this. Any help would be highly appreciated.

Regards
Utkarsh





Re: [R] Splitting a vector into equal groups

2009-05-04 Thread ronggui
lattice:::equal.count may be what you want.





-- 
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html



Re: [R] Splitting a vector into equal groups

2009-05-04 Thread Dimitris Rizopoulos
Check the functions cut() and quantile(), and cut2() from package Hmisc; 
maybe the following is close to what you want:


x <- sample(30, 52, replace = TRUE)

k <- 5 # how many groups
qs <- quantile(x, seq(0, 1, length.out = k + 1))
y <- cut(x, round(qs), include.lowest = TRUE)
y
table(y)


I hope it helps.

Best,
Dimitris





--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



Re: [R] Splitting a vector into equal groups

2009-05-04 Thread Berwin A Turlach
G'day Utkarsh,

On Mon, 04 May 2009 11:51:21 +0530
utkarshsinghal utkarsh.sing...@global-analytics.com wrote:

> I have vector of length 52, say, x=sample(30,52,replace=T). I want to 
> sort x and split into five *nearly equal groups*.

What do you mean by *nearly equal groups*?  The size of the groups
should be nearly equal? The sum of the elements of the groups should be
nearly equal?

> Note that the observations are repeated in x so in case of a tie I
> want both the observations to fall in same group.

Then it becomes even more important to define what you mean by
"nearly equal groups".

As a start, you may consider:

R> set.seed(1)
R> x=sample(30,52,replace=T)
R> xrle <- rle(sort(x))
R> xrle
Run Length Encoding
  lengths: int [1:25] 2 1 2 2 3 1 1 1 5 1 ...
  values : int [1:25] 1 2 4 6 7 8 9 11 12 13 ...
R> cumsum(xrle$lengths)
 [1]  2  3  5  7 10 11 12 13 18 19 24 25 26 28 29 32 35 38
[19] 43 45 46 48 49 51 52

and use this to determine our cut-offs.  E.g., should the first group
have 10, 11 or 12 elements in this case?  The information in xrle
should enable you to construct your five groups once you have decided
on a grouping.

HTH.

Cheers,

Berwin
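
One possible way to turn the cumulative counts above into groups is sketched below. It assumes that "nearly equal" refers to group size and that each cut-off should sit at the distinct value whose cumulative count is closest to 1/5, 2/5, ... of the data; that rule is an assumption made for illustration, not something stated in the thread. The names `k`, `cum`, `targets`, `idx`, `upper` and `grp` are likewise illustrative.

```r
set.seed(1)
x <- sample(30, 52, replace = TRUE)
k <- 5                                   # desired number of groups

xrle <- rle(sort(x))
cum  <- cumsum(xrle$lengths)             # observations <= each distinct value

# Ideal cumulative sizes for the first k - 1 groups.
targets <- length(x) * seq_len(k - 1) / k

# For each target, take the distinct value whose cumulative count is closest.
idx   <- vapply(targets, function(t) which.min(abs(cum - t)), integer(1))
upper <- unique(xrle$values[idx])        # upper bounds of all but the last group

# cut() uses right-closed intervals by default, so values <= upper[1] fall
# in group 1, and so on. Equal values always land in the same group because
# the group depends only on the value itself.
grp <- cut(x, breaks = c(-Inf, upper, Inf), labels = FALSE)
table(grp)
```

The group sizes will generally not be exactly equal, since ties are never split; how far they may drift is the decision the reply above asks for.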

=== Full address ===
Berwin A Turlach
Dept of Statistics and Applied Probability
Faculty of Science
National University of Singapore
6 Science Drive 2, Blk S16, Level 7
Singapore 117546
Tel.: +65 6516 4416 (secr), +65 6516 6650 (self)
FAX : +65 6872 3919
e-mail: sta...@nus.edu.sg
http://www.stat.nus.edu.sg/~statba



[R] splitting time vector into days

2008-09-09 Thread Alexy Khrabrov
Greetings -- I have a dataframe a with one element a vector, time, of  
POSIXct values.  What's a good way to split the data frame into  
periods of a$time, e.g. days, and apply a function, e.g. mean, to some  
other column of the dataframe, e.g. a$value?


Cheers,
Alexy



Re: [R] splitting time vector into days

2008-09-09 Thread stephen sefick
?aggregate
?window.zoo
?rollapply

anyway have a look at package zoo





-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] splitting time vector into days

2008-09-09 Thread jim holtman
Here is one way of doing it:

> x <- data.frame(dates=seq(as.POSIXct('2008-09-08'), by='7 hours', length=10),
+ values=1:10)
> # split into days
> x.s <- split(x, format(x$dates, "%Y%m%d"))
> x.s
$`20080908`
dates values
1 2008-09-08 00:00:00  1
2 2008-09-08 07:00:00  2
3 2008-09-08 14:00:00  3
4 2008-09-08 21:00:00  4

$`20080909`
dates values
5 2008-09-09 04:00:00  5
6 2008-09-09 11:00:00  6
7 2008-09-09 18:00:00  7

$`20080910`
 dates values
8  2008-09-10 01:00:00  8
9  2008-09-10 08:00:00  9
10 2008-09-10 15:00:00 10

> lapply(x.s, function(.df) mean(.df$values))
$`20080908`
[1] 2.5

$`20080909`
[1] 6

$`20080910`
[1] 9
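
The split-then-lapply steps above can also be collapsed into a single call. The sketch below rebuilds the same example data and uses `aggregate()` and `tapply()`. The `tz = "UTC"` argument and the `day` column are additions for this illustration, chosen so that day boundaries do not depend on the local time zone.

```r
x <- data.frame(
  dates  = seq(as.POSIXct("2008-09-08", tz = "UTC"),
               by = "7 hours", length.out = 10),
  values = 1:10
)

# One label per calendar day, taken from the time stamps.
x$day <- format(x$dates, "%Y-%m-%d")

# Mean of 'values' within each day, returned as a data frame.
daily <- aggregate(values ~ day, data = x, FUN = mean)
daily

# The same summary as a named vector.
by_day <- tapply(x$values, x$day, mean)
by_day
```

For other periods, only the format string needs to change, for example "%Y-%m" for months or "%Y-%m-%d %H" for hours.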








-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] splitting a vector on comma

2008-05-05 Thread Henrique Dallazuanna
Try:

scan(textConnection(u), sep=",")





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



[R] splitting a vector on comma

2008-05-04 Thread Georg Ehret
Dear R Usergroup,
 I have the following vector and I would like to split it on ",".
How can I do this?
> u
[1] "160798191,160802762,160813395,160816017,160817873,160824082,160825247,160826925,160834272,160836257,"

Thank you in advance!
With my best regards, Georg.

Georg Ehret
Baltimore
USA



Re: [R] splitting a vector on comma

2008-05-04 Thread David Scott


?strsplit
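
A short sketch of what that help page leads to, assuming `u` is a single character string as in the question. The names `pieces` and `ids` are illustrative.

```r
u <- "160798191,160802762,160813395,160816017,160817873,160824082,160825247,160826925,160834272,160836257,"

# strsplit() returns a list with one element per input string;
# [[1]] extracts the pieces for the single string in u.
# A separator at the very end of the string does not produce an extra piece.
pieces <- strsplit(u, ",", fixed = TRUE)[[1]]

# Convert the character pieces to numbers if that is what is needed.
ids <- as.numeric(pieces)
length(ids)   # 10
```

If `u` held several such strings, dropping the `[[1]]` would keep the result as a list with one vector of pieces per string.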





_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics
