Arnaud,
Short answer may be that the tibble data structure will not be supporting row
names and you may want to simply save those names in an additional column or
externally.
My first thought was to simply save the names you need and then put them back
on the tibble. In your code, something
Jef, your terse reply was so constructive that you converted me! LOL!
That is an interesting point though that I remain a bit unclear on.
Both data.frame and as.data.frame can be used in some ways similarly as in:
> data.frame(matrix(1:12, nrow=3))
  X1 X2 X3 X4
1  1  4  7 10
2  2  5  8 11
3
Борис,
Try this where you tell matrix the column names you want:
nouns <- as.data.frame(
matrix(c(
"gaggle",
"geese",
"dule",
"doves",
"wake",
"vultures"
),
ncol = 2,
byrow = TRUE,
dimnames=list(NULL, c("collective", "category"
Result:
> nouns
Just a minor point in the suggested solution:
df$LAP <- with(df, ifelse(G=='male', (WC-65)*TG, (WC-58)*TG))
since WC and TG are not conditional, would this be a slight improvement?
df$LAP <- with(df, TG*(WC - ifelse(G=='male', 65, 58)))
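A minimal sketch to check that the two forms agree. The data frame here is made up for illustration; only the column names G, WC and TG come from the code above.

```r
# Hypothetical data; only the column names follow the snippet above.
df <- data.frame(
  G  = c("male", "female", "male", "female"),
  WC = c(90, 80, 100, 70),
  TG = c(1.5, 2.0, 1.2, 0.9)
)

# Original form: both branches repeat the multiplication by TG.
lap1 <- with(df, ifelse(G == "male", (WC - 65) * TG, (WC - 58) * TG))

# Factored form: only the constant subtracted from WC depends on G.
lap2 <- with(df, TG * (WC - ifelse(G == "male", 65, 58)))

# Compare the two results element by element.
all.equal(lap1, lap2)
```

The factored version calls ifelse() on two constants only, so the part that does not depend on G is written once.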
-Original Message-
From: R-help On Behalf Of
To be fair, Jordan, I think R has some optimizations so that the arguments
in some cases are NOT evaluated until needed. So only one or the other
choice ever gets evaluated for each row. My suggestion merely has
typographic implications and some aspects of clarity and minor amounts of
less memory
There are many techniques, Callum, and yours is an interesting twist I had not
considered.
Yes, you can specify what integer a factor uses to represent things but not
what I meant. Of course your trick does not work for some other forms of data
like real numbers in double format. There is a
Yes, Bert. At first glance I thought it was one of the merge/joins and then
wondered at the wording that made it sound like the ids may not be one per
column.
IFF the need is the simpler case, it is a straightforward enough and common
need. An example might make it clear enough so actual code
There may be a point to consider about the field containing dates in the
request below. Yes, much code will "work" just fine if the column is seen
as text as you can group by that too. The results will perhaps not be in the
order by row that you expected but you can do your re-sorting
Paul,
I have snipped away your long message and want to suggest another approach
or way of thinking to consider.
You have received other good suggestions and I likely would have used
something like that, probably within the dplyr/tidyverse but consider
something simpler.
You seem to be viewing
David,
You have choices depending on your situation and plans.
Obviously the ideal solution is to make any CSV you save your EXCEL data in to
have exactly what you want. So if your original EXCEL file contains things like
a blank character down around row 973, get rid of it or else all lines
David,
This may just be the same as your earlier problem. When the type of a column is
guessed by looking at the early entries, any non-numeric entry forces the
entire column to be character.
Suggestion: fix your original EXCEL FILE or edit your CSV to remove the last
entries that look just
Charity,
As some of the answers I have seen show, your question is not clear.
You need to be clear on what you mean about R software and other concepts
before an answer makes sense.
The Base version of R may come on your computer already but likely has been
installed from some external source,
Rui,
The problem with searching for elements, as with many kinds of text, is that
the optimal search order may depend on the probabilities of what is involved.
There can be more elements added such as Unobtainium in the future with
whatever abbreviations that may then change the algorithm you
Leonard,
Since it now seems a main consideration you have is speed/efficiency, maybe a
step back might help.
Are there simplifying assumptions that are valid or can you make it simpler,
such as converting everything to the same case?
Your sample data was this and I assume your actual data is
This discussion is sooo familiar.
If you want indefinite precision arithmetic, feel free to use a language and
data type that supports it.
Otherwise, only do calculations that fit in a safe zone.
This is not just about this scenario. Floating point can work well when adding
(or subtracting)
To be clear, I take no credit for the rather extraordinary function call shown
below:
mutate(Date = lubridate::dmy_hm(Date))
I would pretty much never have constructed such an interesting and rather
unnecessary line of code.
ALL the work is done within the parentheses:
Date =
If I follow this thread, it looks clear that the problem is superficial and not
really about the c() function as it is below sea level.
Is this also a problem if you replace c() with max() or list() as I think it
may be? Then it is more about what length the interpreter is able to handle
Eberhard,
Others have supplied ways to do this using various date management functions.
But I want to add another option that may make sense if the dates are not all
quite as predictable.
You can use your own regular expressions in R as in most languages, that try to
match each entry to one
Tim,
Your reply is reasonable if you want to read in EVERYTHING and use various
nice features of the select() function in the dplyr package of the tidyverse
that let you exclude a bunch of columns based on names starting or ending or
containing various characters or not being of type integer and
To be clear, everything has limits beyond which it is not expected to have to
deal with. Buffers often pick a fixed size and often need complex code to keep
grabbing a bigger size and copy and add more, or arrange various methods to
link multiple memory areas into a growing whole, such as
Ranjeet,
As others have said, you have not shown enough to get decent answers.
What you describe sounds quite routine and is the first step many have to do
when gathering data from disparate sources that were done by different people
without much consideration it has to follow some specific
Javad,
If I understood you, you want to use one of many methods to GROUP BY one
column and take the minimum within each group.
If your data is set up right, perhaps using factors, there are base R
versions but many would also suggest using dplyr/tidyverse methods such as
piping your data to
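As one possible base R version of the "minimum within each group" idea, here is a minimal sketch. The data frame and its column names (Code, Value) are hypothetical, not taken from the original request.

```r
# Hypothetical data: a grouping column and a numeric column.
df <- data.frame(
  Code  = c("a", "b", "a", "b", "c"),
  Value = c(3, 5, 1, 7, 2)
)

# One row per group, holding the smallest Value seen in that group.
mins <- aggregate(Value ~ Code, data = df, FUN = min)
mins
```

aggregate() returns a data.frame, which is convenient if the result is to be merged back or printed; tapply(df$Value, df$Code, min) would give a named vector instead.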
I read all the replies and am not sure why nobody used what I see as simpler
and more direct.
Assuming the ORDER of the output matters, it tends to be controlled by the
order of the factor called Code so I have simple code like this:
---
# Load required libraries
library(dplyr)
# Simulate
Yes, Timothy, the request was not seen by all of us as the same.
Indeed if the request was to show a subset of the original data consisting
of only the rows that were the minimum for each Code and also showed ties,
then the solution is a tad more complex. I would then do something along the
lines
The requirements keep being clarified and it would have been very useful to
know more in advance.
To be clear. My earlier suggestion was based on JUST wanting the minimum for
each unique version of Code. Then you wanted it in the original order so that
was handled by carefully making that a
David,
As others have said, there are many possible answers for a vague enough
question.
For one-time data it is often easiest to simply change the data source as you
say you did in EXCEL.
Deleting the 18th row can easily be done in R and might make sense if you get
daily data and decided
Adding to what Nick said, extra lines like those described often are in some
comment format like beginning with "#" or some consistent characters that can
be filtered out using comment.char='#' for example in read.csv() or
comment="string" in the tidyverse function read_csv().
And, of course
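A minimal sketch of the comment.char approach in base R, assuming the extra lines start with "#". The CSV text below is invented for illustration.

```r
# Hypothetical CSV content with comment lines mixed in.
csv_text <- "# exported from some tool
id,value
1,10
# a stray note
2,20"

# With comment.char = "#", lines starting with "#" are skipped.
# (read.csv() defaults to comment.char = "", i.e. comments are off.)
dat <- read.csv(text = csv_text, comment.char = "#")
dat
```

If the unwanted lines do not share a consistent prefix, this will not help and they have to be filtered some other way.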
Javad,
After reading the exchanges, I conclude you are asking a somewhat different
question than some of us expected and I see some have zoomed in on what you
seem to want.
You seem to want to make a very focused change and save the results to be as
identical as what you started with. You
Has anyone noticed something a tad unusual?
Someone shows up and seemingly politely asks a totally open-ended question and
supplies NO DETAILS about their personal status and experience that would be
needed to tell him whether it would take various amounts of time for him to
learn enough R
I am not replying to the earlier request just to the part right below my
message.
A simple suggestion when sending people code is to add NOTHING except proper
comments.
Can we assume the extra asterisks are superfluous and not in your code?
I mean your column is named "Period" and not "*Period"
Tim and others,
A point to consider is that there are various algorithms in the functions
used to read in formatted data into data.frame form and they vary. Some do a
look-ahead of some size to determine things and if they find a column that
LOOKS LIKE all integers for say the first thousand
Another exceedingly polite questioner. Cultural differences!
I think we can skip discussing if we are doing well, and get to the point.
To start with, I got thrown by these two lines:
a=rnorm(1000, 110, 5)
s = length(a)
This does not relate to the difficulty, but is a sort of sloppy use as a
Documentation specifics aside, and I am not convinced that is an issue here,
there is a responsibility on programmers on how to use routines like this by
testing small samples and seeing if the results match expectations.
Since negative numbers were possible, that would have been part of such
Fair enough, Akshay. Wondering why a design was chosen is reasonable.
There are languages like python that allow unpacking multiple values and it
is not uncommon to return multiple things from some constructs as in this:
>>> a,b,c = { 4, 5, 6 }
>>> a
4
>>> b
5
>>> c
6
But that is
Akshay,
Your question seems a tad mysterious to me as you are complaining about
NOTHING.
R was designed to return single values. The last statement executed in a
function body, for example, is the value returned even when not at the end.
Scoping is another issue entirely. What is visible is
Yes, not every use of a word has the same meaning. The UNIX pipe was in many
ways a very different animal where the PIPE was a very real thing and looked
like a sort of temporary file in the file system with special properties.
Basically it was a fixed-size buffer that effectively was written into
Tim,
There are differences and this one can be huge.
The other pipe operators let you pass the current object to a later argument
instead of the first by using a period to represent where to put it. The new
one has a harder albeit flexible method by creating an anonymous function.
-Original
John,
The topic has indeed been discussed here endlessly but new people still
stumble upon it.
Until recently, the formal R language did not have a built-in pipe
functionality. It was widely used through an assortment of packages and
there are quite a few variations on the theme including
Boris,
There are MANY variations possible and yours does not seem that common or
useful albeit perfectly useful.
I am not talking about making it a one-liner, albeit I find the multi-line
version more useful.
The pipeline concept seems sort of atomic in the following sense. R allows
several
Evan, there are oodles of ways to do many things in R, and much of what the
tidyverse supplies can often be done as easily, or easier, outside it.
Before presenting a solution, I need to make sure I am answering the same
question or problem you intend.
Here is the string you have as an example:
Boris,
What you are telling us is not particularly new or spectacular in a sense.
It has often been hard to grade assignments students do when they choose an
unexpected path. I had one instructor who always graded my exams (in the
multiple courses I took with him) because unlike most of the
This may be a fairly dumb and often asked question about some functions like
strsplit() that return a list of things, often a list of ONE thing that may be
another list or a vector and needs to be made into something simpler.
The examples shown below have used various methods to convert the
Kai,
As Bert pointed out, it may not be clear what you want.
As a GUESS, you have some arbitrary data.frame object with multiple columns and
you want to do something on selected columns. Consider changing your idea to be
in several stages for simplicity and then optionally later rewriting it.
Kai,
I have read all the messages exchanged so far and what I have not yet seen is a
clear explanation of what you want to do. I mean not as R code that may have
mistakes, but as what your goal is.
Your code below was a gigantic set of nested if statements that is not trivial
to parse.
Steven,
Just want to add a few things to what people wrote.
In base R, the methods mentioned will let you make a copy of your original DF
that is missing the items you are selecting that match your pattern.
That is fine.
For some purposes, you want to keep the original data.frame and remove a
John,
I am very familiar with the evolving tidyverse and some messages a while back
included people who wanted this forum to mainly stick to base R, so I leave out
examples.
Indeed, the tidyverse is designed to make it easy to select columns with all
kinds of conditions including using
Valentin,
You are correct that R does many things largely behind the scenes that make
some operations fairly efficient.
From a programming point of view, though, many people might make a data.frame
and not think of it as a list of vectors of the same length that are kept that
way.
So if
Just an idea if this is a one-time need to copy static data once used in
non-R to R. You are a bit vague about what you mean by "objects."
If you can find someone who uses S or S+ then maybe they can load the data
in and export it in some format usable for you and send you those files. If,
for
Kai,
Just FYI, this is mainly an R mailing list and although there are ways to
combine python with R (or sort of alone) within environments like RSTUDIO, this
may not be an optimal place to discuss this. You are discussing what is no
longer really "R markdown" and more just plain "markdown"
John,
As you said, you are new to the discussion so let me catch you up.
The original question was about removing many columns that shared a similar
feature in the naming convention while leaving other columns in-place. Quite a
few replies were given on how to do that including how to use a
Again, John, we are comparing different designs in languages that are often
decades old and partially retrofitted selectively over the years.
Is it poor form to use global variables? Many think so. Discussions have
been had on how to use variables hidden in various ways that are not global,
such
Richard,
I appreciate your observations. As regularly noted, there are many possible
forks in the road to designing a language and it seems someone is determined
to try every possible fork.
Yes, some languages that are compiled, or like JavaScript, read the entire
function before executing it
I see many are not thrilled with the concise but unintuitive way it is
suggested you use with the new R pipe function.
I am wondering if anyone has created one of a family of functions that might be
more intuitive if less general.
Some existing pipes simply allowed you to specify where in an
Sometimes you need to NOT use a regular expression and do things simpler. You
have a fairly simple example that not only does not need great power but may be
a pain to do using a very powerful technique, especially if you want to play
with look-ahead and look behind.
Assuming you have a line
Try:
print(data.frame(COL1=1:5, COL2=10:6), row.names=FALSE)
-Original Message-
From: R-help On Behalf Of Dennis Fisher
Sent: Monday, March 27, 2023 1:05 PM
To: r-help@r-project.org
Subject: [R] printing a data.frame without row numbers
R 4.2.3
OS X
Colleagues,
I am printing a
I may be missing something but using the plain old c() combine function
seems to work fine:
df <- data.frame(left = 1:5, right = 6:10)
df.combined <- data.frame(comb = c(df$left, df$right))
df
  left right
1    1     6
2    2     7
3    3     8
4    4     9
5    5    10
df.combined
comb
1
The example given does not leave room for even a single copy of your matrix
so, yes, you need alternatives.
Your example was fairly trivial as all you wanted to do is subtract each
value from 100 and replace it. Obviously something like squaring a matrix
has no trivial way to do without multiple
Your spelling of:
HH size
Is two words.
-Original Message-
From: R-help On Behalf Of Nandini raj
Sent: Monday, March 20, 2023 1:17 PM
To: r-help@r-project.org
Subject: [R] DOUBT
Respected sir/madam
can you please suggest what is an unexpected symbol in the below code for
running a
A major question is why you ask how to use the subset function rather than
asking how to get your job done.
As you note, the simple way to get the first N items is to use indexing. If you
absolutely positively insist on using subset, place your data into something
like a data.frame and add a
In reading the post again, it sounds like the question is how to create a
logical condition that translates as 1:N is TRUE. Someone hinted along those
lines.
So one WAY I might suggest is you construct a logical vector as shown below. I
give an example of a bunch of 9 primes and you want only
Naresh,
This is a common case where the answer to a question is to ask the right
question.
Your question was how to make apply work. My question is how can you get the
functionality you want done in some version of R.
Apply is a tool and it is only one of many tools and may be the wrong
Jorgen is correct that for many purposes, viewing a data.frame as a
collection of vectors of the same length allows you to code fairly complex
logic using whichever vectors you want and result in a vector answer, either
externally or as a new column. Text columns used to make some decisions in
the
Steven,
The default is drop=TRUE.
Use drop=FALSE if you want to retain a data.frame and not have it reduced to a
vector under some circumstances.
https://win-vector.com/2018/02/27/r-tip-use-drop-false-with-data-frames/
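A minimal sketch of the difference, using a made-up two-column data.frame:

```r
# Hypothetical data.frame for illustration.
df <- data.frame(a = 1:3, b = 4:6)

# Selecting one column with the default drop = TRUE gives a plain vector.
as_vector <- df[, "a"]

# With drop = FALSE the result stays a one-column data.frame.
as_df <- df[, "a", drop = FALSE]

is.data.frame(as_vector)  # FALSE
is.data.frame(as_df)      # TRUE
```

This matters mostly in code where the set of selected columns is computed and may happen to have length one.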
-Original Message-
From: R-help On Behalf Of Steven T. Yen
Sent: Sunday,
Richard, it is indeed possible for different languages to choose different
approaches.
If your point is that an R named list can simulate a Python dictionary (or for
that matter, a set) there is some validity to that. You can also use
environments similarly.
Arguably there are differences
Javad,
There may be nothing wrong with the methods people are showing you and if it
satisfied you, great.
But I note you have lots of data in over a quarter million rows. If much of the
text data is redundant, and you want to simplify some operations such as
changing some of the values to
John,
I am a tad puzzled at why your code does not work so I tried replicating it.
Let me say you are not plotting what you think. When you plot points using
characters, it LOOKS like it did something but not really. It labels four
equally apart lines (when your data is not linear) and you are
Anupam,
Your question, even after looking at other messages, remains a bit unclear.
What do you mean by "labels"? What do you mean by variables and values and how
is that related to factors?
An example or two would be helpful so we can say more than PROBABLY.
Otherwise, you risk having many people
Anupam,
Thanks for explaining you are talking about factors.
I see my friend Adrian has pointed out reasons you may want to use a package he
built called “declared” but my answer will be within the regular R domain as
you asked.
You should read up a bit on factors in a book, not just
Just to bring it back to R, I want to point out that what many R programmers do
is not that different. If you develop some skills at analyzing some kinds of
data and have a sort of toolchest based on past work, then a new project along
similar lines may move very quickly. After a while, you
Jim,
I am not sure what your example means but text to image conversion can be
done quite easily in many programming environments and does not need an AI
unless you are using it to hunt for info. I mean you can open up many Paint
or Photo programs and look at the menus and often one allows you
Hadley,
Thanks and I know many such things exist. I simply found it interesting that
what was mentioned seemed simpler as just being a converter of text to make a
bitmap type image. Now if I want a simulated image of a cat riding a motorcycle
while holding an Esperanto Flag, sure, I would not
Interesting to read all the answers. Personally, I was a bit irked to see
that using a combination of assignments using rownames() and colnames() did
not work as one canceled what the other had done.
But it turns out if we listened to what John really wanted versus what he said
he wanted, then a
Evan,
List names are less easy than data.frame column names so try this:
> test <- list(a=3,b=5,c=11)
> colnames(test)
NULL
> colnames(as.data.frame(test))
[1] "a" "b" "c"
But note an entry with no name has one made up for it.
> test2 <- list(a=3,b=5, 666, c=11)
> colnames(data.frame(test2))
All true Jeff, but why do things the easy way! LOL!
My point was that various data structures, besides the list we started with,
store the names as an attribute. Yes, names(listname) works fine to extract
whatever parts they want. My original idea of using a data.frame was because
it creates
Evan,
Yes, once you know a bit about the details, all kinds of functions are
available to solve problems without going the hard way.
But the names() function is taught fairly widely and did you also pick up on
the fact that it can be used on both sides so it also sets the names?
> # Create a
Jeff R, it would be helpful if your intent was understood.
For example, did you want output as a column of labels c("A", "B", "C") and
another adjacent of c(0.0011566127, 0.0009267028, 0.0081623324) then you
could do:
data.frame(labels=c("A", "B", "C"), data=c(0.0011566127, 0.0009267028,
Jeff,
The number of items is not relevant except insofar as your vector of
probabilities is in the same order as the other vector and the same length.
If for example you had a vector of test scores for 10,000 tests and you
calculated the probability in the data of having a 100, then the
Jeff,
I wish I could give you an answer to a very specific question.
You have lots of numbers in a vector representing whatever "probabilities"
mean something to you. There are currently no names associated with them.
And you want to make some kind of graph using ggplot.
So, to be quite clear,
The problem being discussed is really a common operation that R handles
quite easily in many ways.
The code shown has way too many things that do not fit to make much sense
and is not written the way many R programmers would write it.
Loops like the one used are legal but not needed.
As has
[See the end for an interesting twist on moving a column to row.names.]
Yes, many ways to do things exist but it may make sense to ask for what the
user/OP really wants. Sometimes the effort to make a brief example obscures
things.
Was there actually any need to read in a file containing
Val,
A data.frame is not quite the same thing as a matrix.
But as long as everything is numeric, you can convert both data.frames to
matrices, perform the computations needed and, if you want, convert it back
into a data.frame.
BUT it must be all numeric and you violate that requirement by
Steve,
As Iris pointed out, some implementations of a matrix are actually of a vector
with special qualities. There are sometimes choices whether to store it a row
at a time or a column at a time.
In R, your data consisted of the integers from 1 to 20 and they clearly are
stored a column at a
Eric,
I am not sure your solution is particularly economical albeit it works for
arbitrary arrays of any dimension, presumably. But it seems to involve
converting a matrix to a tensor just to undo it back to a vector. Other
solutions offered here, simply manipulate the dim attribute of the
Based on a private communication, it sounds like Steven is asking the question
again because he wants a different solution that may be the way this might be
done in another language. I think he wants to use loops explicitly and I
suspect this may be along the lines of a homework problem for
Eric,
I fully agreed with you that anyone doing serious work in various projects such
as machine learning that make heavy use of mathematical data structures would
do well to find some decent well designed and possibly efficient packages to do
much of the work rather than re-inventing their
This topic is getting almost funny as there are an indefinite ever-sillier
set of ways to perform the action and even more if you include packages like
purrr.
If mymat is a matrix, several variants work such as:
> mymat
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
I was rushing out Phil so let me amend what I wrote. As others noted, this is
fairly beginner stuff. If you have more such questions, besides reading up,
please consider sending questions to the Tutor mailing list where there is more
patience.
You wanted to change selected small values to
Bert,
I stand corrected. What I said may have once been true but apparently the
implementation seems to have changed at some level.
I did not factor that in.
Nevertheless, whether you use an index as a key or as an offset into an
attached vector of labels, it seems to work the same and I
Chris,
Consider breaking up your task into multiple passes.
And do them in whatever order preserves what you need.
First, are you talking about brackets as in square brackets, or as in your
example, parentheses?
If you are sure you have no nested brackets, your requirement seems to be that
Phil,
What have you tried? This seems straightforward enough.
Could you clarify what you mean by NULL?
In R, it is common to use NA or a more specific version of it.
So assuming you have two vectors containing floats with some NA, then:
C <- A*B
Will give you the products one at a time if
Leonard,
It can be helpful to spell out your intent in English or some of us have to go
back to the documentation to remember what some of the operators do.
Your text being searched seems to be an example of items between commas with an
optional space after some commas and in one case, nothing
Steven,
It depends what you want to do. What you are showing seems to replace the
values stored in "data" each time.
Many kinds of loops will do that, with one simple way being to store all the
filenames in a list and loop on the contents of the list as arguments to
read.csv.
Since you show
Emily,
I too copied/pasted your code in and it worked fine. I then asked for the
function definition and got it.
Did you put the entire text in? I mean nothing extra above or below except
maybe whitespace or comments?
What sometimes happens to make the code incomplete is to leave out a
matching
Nick, obviously figuring out the problem is best but you may want to deal
with the symptom.
RSTUDIO lets you adjust the sizes of the various windows and enlarging the
window (lower right normally) where the graph is shown may be a first
attempt if the problem is display space.
And note RSTUDIO
Having read all of the replies, it seems there are solutions for the
question and the OP points out that some solutions such as making the
document twice will affect the creation date.
I suspect the additional time to do so is seconds or at most minutes so it
may not be a big deal.
But what
This may be a dumb question and the answer may make me feel dumber.
I have had trouble for years with R packages wanting Rtools on my machine
and not being able to use it. Many packages are fine as binaries are
available. I have loaded Rtools and probably need to change my PATH or
something.
With all this discussion, I shudder to ask this. I may have missed the
answers but the discussion seems to have been about identifying and solving
the problem rapidly rather than what maybe is best going forward if all
parties agree.
What was the motivation for what RSTUDIO did for their version
Thank you Duncan, you explained quite a bit.
I am unclear how this change causes the problem the OP mentioned.
It is an example of people using a clever trick to get what they think they
want that could be avoided if the original program provided a hook. Of
course the hook could be used more
Chris, since it does indeed look like homework, albeit a deeper look
suggests it may not be, I think we can safely answer the question:
>Is there any way to write codes to do this in R?
The answer is YES.
And before you ask, it can be done in Python, Java, C++, Javascript, BASIC,
FORTRAN and
John,
Your reaction was what my original reaction was until I realized I had to
find out what a DEM file was and that contains enough of the kind of
depth-dimension data you describe albeit what may be a very irregular cross
section to calculate for areas and thence volumes.
If I read it