Re: [R] Plotting the ASCII character set.
On Jul 3, 2021, at 7:00 PM, Rolf Turner wrote:

>> On Sat, 3 Jul 2021 09:40:28 +0200
>> Ivan Krylov wrote:
>>
>> Hello Rolf Turner,
>>
>> On Sat, 3 Jul 2021 14:02:59 +1200
>> Rolf Turner wrote:
>>
>>> Can anyone suggest how I might get my plot_ascii() function working
>>> again? Basically, it seems to me, the question is: how do I
>>> persuade R to read in "\260" as "\ub0" rather than "\xb0"?
>>
>> Part of the problem is that the "\xb0" byte is not in ASCII, which
>> covers only the lower half of possible 8-bit bytes. I guess that the
>> strings containing bytes with highest bit set used to be interpreted
>> as Latin-1 on your machine, but now get interpreted as UTF-8, which
>> changes their meaning (in UTF-8, the highest bit being set indicates
>> that there will be more bytes to follow, making the string invalid if
>> there is none).
>>
>> The good news is, since it's Latin-1, which is natively supported by
>> R, there are even multiple options:
>>
>> 1. Mark the string as Latin-1 by setting Encoding(a) <- 'latin1' and
>> let R do the re-encoding if and when Pango asks it for a UTF-8-encoded
>> string.
>>
>> 2. Decode Latin-1 into the locale encoding by using iconv(a, 'latin1',
>> '') (or set the third parameter to 'UTF-8', which would give almost
>> the same result on a machine with a UTF-8 locale). The result is,
>> again, a string where Encoding(a) matches the truth. Explicitly
>> setting UTF-8 may be preferable on Windows machines running pre-UCRT
>> builds of R where the locale encoding may not contain all Latin-1
>> characters, but that's not a problem for you, as far as I know.
>>
>> For any encoding other than Latin-1 or UTF-8, option (2) is still
>> valid.
>>
>> I have verified that your example works on my GNU/Linux system with a
>> UTF-8 locale if I use either option.
>
> Thanks Ivan. That solves most of the problem, but there are still
> glitches. I get a plot OK, but a substantial number of the characters
> are displayed as a wee rectangle containing a 2 x 2 array of digits
> such as
>
>> 0 0
>> 8 0
>
> Also note that there is a bit of difference between the results of using
> Encoding() and the results of using iconv(). E.g. if I do
>
>    a <- "\x80"
>    b <- iconv(a,"latin1","UTF-8")
>    Encoding(a) <- "latin1"
>
> then when I type "a" I get the Euro symbol "€", but when I type "b"
> I get the string "\u0080".
>
> But that doesn't really matter. More problematic is the fact that if I
> do either
>
>    plot(0,0,type="n",xlim=c(0,1),ylim=c(0,1),ann=FALSE,axes=FALSE)
>    text(0.5,0.5,labels=a,cex=6)
> or
>
>    plot(0,0,type="n",xlim=c(0,1),ylim=c(0,1),ann=FALSE,axes=FALSE)
>    text(0.5,0.5,labels=b,cex=6)
>
> then I get wee rectangle with 0 0 8 0 arranged in a 2 x 2 array inside.
> (Setting cex=6 makes it easier for my ageing eyes to see what the
> digits are.)
>
> Is there any way that I can get the Euro symbol to display correctly in
> such a graphic?

Pick a font that is supported on your OS that has the desired glyph.
Also look at the examples in: ?points

David

> Thanks.
>
> cheers,
>
> Rolf
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
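One possible reading of the box, offered as a sketch and not as a diagnosis of Rolf's setup: in ISO 8859-1 (Latin-1) the byte 0x80 is the C1 control character U+0080, which has no printable glyph, and some text renderers draw a placeholder box holding the hex digits 0080 in that case. The Euro sign is U+20AC, which sits at byte 0x80 in Windows-1252 instead. The snippet below assumes the platform's iconv accepts the names "ISO-8859-1" and "CP1252" (glibc does; other platforms may differ), and all variable names are illustrative.

```r
# One raw byte 0x80, built without relying on how the parser reads "\x80".
byte80 <- rawToChar(as.raw(0x80))

# Read as ISO 8859-1, the byte is the control character U+0080.
as_latin1 <- iconv(byte80, from = "ISO-8859-1", to = "UTF-8")

# Read as Windows-1252 (assumed to be known to iconv as "CP1252"),
# the same byte is the Euro sign, U+20AC.
as_cp1252 <- iconv(byte80, from = "CP1252", to = "UTF-8")

code_points <- c(latin1 = utf8ToInt(as_latin1), cp1252 = utf8ToInt(as_cp1252))
print(sprintf("U+%04X", code_points))

# Drawing the Euro sign from its Unicode escape sidesteps the byte ambiguity.
# Whether a glyph appears still depends on the device and font in use.
if (interactive()) {
  plot(0, 0, type = "n", xlim = c(0, 1), ylim = c(0, 1), ann = FALSE, axes = FALSE)
  text(0.5, 0.5, labels = "\u20ac", cex = 6)
}
```

If the Unicode escape still renders as a box, the font in use probably lacks the glyph, which is where the advice to pick another font applies.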
Re: [R] Plotting the ASCII character set.
On Sat, 3 Jul 2021 09:40:28 +0200
Ivan Krylov wrote:

> Hello Rolf Turner,
>
> On Sat, 3 Jul 2021 14:02:59 +1200
> Rolf Turner wrote:
>
> > Can anyone suggest how I might get my plot_ascii() function working
> > again? Basically, it seems to me, the question is: how do I
> > persuade R to read in "\260" as "\ub0" rather than "\xb0"?
>
> Part of the problem is that the "\xb0" byte is not in ASCII, which
> covers only the lower half of possible 8-bit bytes. I guess that the
> strings containing bytes with highest bit set used to be interpreted
> as Latin-1 on your machine, but now get interpreted as UTF-8, which
> changes their meaning (in UTF-8, the highest bit being set indicates
> that there will be more bytes to follow, making the string invalid if
> there is none).
>
> The good news is, since it's Latin-1, which is natively supported by
> R, there are even multiple options:
>
> 1. Mark the string as Latin-1 by setting Encoding(a) <- 'latin1' and
> let R do the re-encoding if and when Pango asks it for a UTF-8-encoded
> string.
>
> 2. Decode Latin-1 into the locale encoding by using iconv(a, 'latin1',
> '') (or set the third parameter to 'UTF-8', which would give almost
> the same result on a machine with a UTF-8 locale). The result is,
> again, a string where Encoding(a) matches the truth. Explicitly
> setting UTF-8 may be preferable on Windows machines running pre-UCRT
> builds of R where the locale encoding may not contain all Latin-1
> characters, but that's not a problem for you, as far as I know.
>
> For any encoding other than Latin-1 or UTF-8, option (2) is still
> valid.
>
> I have verified that your example works on my GNU/Linux system with a
> UTF-8 locale if I use either option.

Thanks Ivan. That solves most of the problem, but there are still
glitches. I get a plot OK, but a substantial number of the characters
are displayed as a wee rectangle containing a 2 x 2 array of digits
such as

> 0 0
> 8 0

Also note that there is a bit of difference between the results of using
Encoding() and the results of using iconv(). E.g. if I do

    a <- "\x80"
    b <- iconv(a,"latin1","UTF-8")
    Encoding(a) <- "latin1"

then when I type "a" I get the Euro symbol "€", but when I type "b"
I get the string "\u0080".

But that doesn't really matter. More problematic is the fact that if I
do either

    plot(0,0,type="n",xlim=c(0,1),ylim=c(0,1),ann=FALSE,axes=FALSE)
    text(0.5,0.5,labels=a,cex=6)

or

    plot(0,0,type="n",xlim=c(0,1),ylim=c(0,1),ann=FALSE,axes=FALSE)
    text(0.5,0.5,labels=b,cex=6)

then I get wee rectangle with 0 0 8 0 arranged in a 2 x 2 array inside.
(Setting cex=6 makes it easier for my ageing eyes to see what the
digits are.)

Is there any way that I can get the Euro symbol to display correctly in
such a graphic?

Thanks.

cheers,

Rolf

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
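The difference between the two calls can be made visible by looking at the bytes. The sketch below assumes a glibc-style iconv where "latin1" means ISO 8859-1: `Encoding<-` only changes the declared encoding and leaves the byte alone, whereas iconv() rewrites the bytes themselves.

```r
a <- rawToChar(as.raw(0x80))                   # same single byte as "\x80"
b <- iconv(a, from = "latin1", to = "UTF-8")   # converts the bytes
Encoding(a) <- "latin1"                        # only relabels the bytes

Encoding(a)    # declared encoding of 'a'
charToRaw(a)   # still the one byte 0x80
Encoding(b)    # 'b' is marked as UTF-8
charToRaw(b)   # two bytes: the UTF-8 form of U+0080
```

Since `b` really holds U+0080, printing it as "\u0080" is the expected result of that conversion.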
Re: [R] concatenating columns in data.frame
Again thanks for carrying on this thread with your additional,
informative comments, as well as the welcome humor.

On 7/3/21 2:59 AM, Jeff Newmiller wrote:

I am very agnostic about tidyverse/base R. However, the complexity of
setting up NSE functions is often simply not needed, and I encounter so
many people who simply disregard base R as being too outdated so that
they never learn how simple solutions in R can be. The contrast between
your solution and Bert's was... perhaps informative, but a nuclear bomb
where an axe was sufficient.

On Fri, 2 Jul 2021, Avi Gross via R-help wrote:

I know what you mean Jeff. Yes I am very familiar with base R
techniques. What I had hoped for was to do two things that some of the
other methods mentioned do that ended up bringing two data.frames
together as part of the solution.

Much of what I used is now standard R. I was looking at the accessory
functions now commonly used in dplyr that let you dynamically select
which columns to work with like starts_with() to choose. Sadly, they
seem to work on a top-level but not easily within a call to something
like paste(...) where they are not evaluated in the way I want.

But the odd method I tried can also be used in standard R with a bit of
work. You can create a function without using dplyr that takes your df
and uses it to concatenate and end with something like:

    df$new_col <- do_something(df, selected_cols)

That too adds a column without the need to merge larger structures
explicitly.

But your other point is a tad religious in a sense. I happen to prefer
learning a core language first then looking at enhancement
opportunities. But at some point, if teaching someone new who wants to
focus on getting a job done simply but not necessarily repeatedly or in
some ideal way, it is best to do things in a way that their mind flows
better. Many things in the tidyverse are redundant with base R or just
"fix" inconsistencies like making sure the first argument is always the
same. But many add substantially to doing things in a more step-by-step
manner.

I do not worship the base language as it first came out or even as it
has evolved. I do like to know what choices I have and pick and choose
among them as needed.

Of course a forum like this is more about base R than otherwise and I
acknowledge that. Still, the ":=" operator is at least parsed by base R
(packages such as rlang and data.table give it a meaning). There is a
new pipeline operator "|>" in base R. Some ideas, good or otherwise, do
get in eventually.

I started doing graphs using base R as in the plot() command. It was
adequate but I wanted better. So I learned about Lattice and various
packages and eventually ggplot. I can now do things I barely imagined
before and am still learning that there is much more I can do with
packages underneath much of the magic and also additional packages
layered above it, in some sense. So I do not approach that with an
either-or mentality either.

Note I am not really talking about just R. I have similar issues with
other languages I program in such as Python. None of them were created
fully-formed and many had to add huge amounts to adapt to additional
wants and needs. Base R for me is often inadequate. But so what?

The task being asked for in this thread in isolation, indeed may not be
done any better using packages. However, if it is part of a larger set
of tasks that can be pipelined, it may well be and I personally was
wondering if there was a way in dplyr. There probably is a much better
way than I assembled if I only knew about it, and if not, they may add
this kind of indirection in a future release if deemed worthy of doing.

I have gone back to programs I did years ago with humungous amounts of
code using what I knew then and reducing it drastically now that I can
tell a function to select say all my column names that end in .orig and
apply a set of functions to them with output going to the base name
followed by .mean and .sd and so on. All that can often be done in one
or two lines of code where previously I had to do 18 near repetitions of
each part and then another and another. That used a limited form of
dynamism.

Be that as it may I think the requester has enough info and we can move
on.

-Original Message-
From: Jeff Newmiller
Sent: Friday, July 2, 2021 1:03 AM
To: Avi Gross ; Avi Gross via R-help ; R-help@r-project.org
Subject: Re: [R] concatenating columns in data.frame

I use parts of the tidyverse frequently, but this post is the best
argument I can imagine for learning base R techniques.

On July 1, 2021 8:41:06 PM PDT, Avi Gross via R-help wrote:

Micha,

Others have provided ways in standard R so I will contribute a somewhat
odd solution using the dplyr and related packages in the tidyverse
including a sample data.frame/tibble I made. It requires newer versions
of R and other packages as it uses some fairly esoteric features
including "the big bang" and the new ":=" operator and more.

You can use you
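The base R route mentioned earlier in this message, `df$new_col <- do_something(df, selected_cols)`, might look roughly like the sketch below. The sample data frame, the `.orig` column names, and the helper `concat_cols()` are all hypothetical stand-ins chosen for illustration; none of them comes from the thread.

```r
# Illustrative data only; not the poster's data frame.
df <- data.frame(
  id     = 1:3,
  a.orig = c("x", "y", "z"),
  b.orig = c("1", "2", "3"),
  stringsAsFactors = FALSE
)

# Choose columns by a name pattern instead of listing them by hand.
selected_cols <- grep("\\.orig$", names(df), value = TRUE)

# Hypothetical helper standing in for do_something().
# unname() keeps column names from being matched to paste() arguments.
concat_cols <- function(data, cols, sep = "") {
  do.call(paste, c(unname(as.list(data[cols])), sep = sep))
}

df$new_col <- concat_cols(df, selected_cols)
print(df)
```

The column is added in place, so no second data frame has to be built and merged back.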
Re: [R] Plotting the ASCII character set.
Hello Rolf Turner,

On Sat, 3 Jul 2021 14:02:59 +1200
Rolf Turner wrote:

> Can anyone suggest how I might get my plot_ascii() function working
> again? Basically, it seems to me, the question is: how do I persuade
> R to read in "\260" as "\ub0" rather than "\xb0"?

Part of the problem is that the "\xb0" byte is not in ASCII, which
covers only the lower half of possible 8-bit bytes. I guess that the
strings containing bytes with highest bit set used to be interpreted as
Latin-1 on your machine, but now get interpreted as UTF-8, which changes
their meaning (in UTF-8, the highest bit being set indicates that there
will be more bytes to follow, making the string invalid if there is
none).

The good news is, since it's Latin-1, which is natively supported by R,
there are even multiple options:

1. Mark the string as Latin-1 by setting Encoding(a) <- 'latin1' and
   let R do the re-encoding if and when Pango asks it for a
   UTF-8-encoded string.

2. Decode Latin-1 into the locale encoding by using
   iconv(a, 'latin1', '') (or set the third parameter to 'UTF-8', which
   would give almost the same result on a machine with a UTF-8 locale).
   The result is, again, a string where Encoding(a) matches the truth.
   Explicitly setting UTF-8 may be preferable on Windows machines
   running pre-UCRT builds of R where the locale encoding may not
   contain all Latin-1 characters, but that's not a problem for you, as
   far as I know.

For any encoding other than Latin-1 or UTF-8, option (2) is still valid.

I have verified that your example works on my GNU/Linux system with a
UTF-8 locale if I use either option.

--
Best regards,
Ivan
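The two options above can be tried on the whole printable upper half of Latin-1 at once. This is a minimal sketch with illustrative variable names; it is not the plot_ascii() function from the thread, which was not posted here.

```r
# Single-byte strings for bytes 0xA0..0xFF, one string per byte.
upper <- vapply(160:255, function(i) rawToChar(as.raw(i)), character(1))

# A lone byte with the high bit set is not valid UTF-8 on its own.
all(!validUTF8(upper))

# Option 1: declare the encoding; the bytes stay as they are.
marked <- upper
Encoding(marked) <- "latin1"

# Option 2: convert the bytes to UTF-8.
converted <- iconv(upper, from = "latin1", to = "UTF-8")
all(validUTF8(converted))

# Element 17 is byte 0xB0 ("\260"); both routes give the degree sign U+00B0.
utf8ToInt(enc2utf8(marked[17]))
utf8ToInt(converted[17])
```

Either vector can then be passed as labels to text(), subject to the font having the glyphs.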
Re: [R] add a variable a data frame to sequentially count unique rows
Hello,

Either I'm not understanding or isn't this just any of

    aggregate(count ~ ., data = test, FUN = length)

    test %>% count(group1, group2, name = "Count")

?

Hope this helps,

Rui Barradas

At 23:27 on 02/07/21, Yuan Chun Ding wrote:

Hi R users,

In this test file,

    test <- data.frame(group1=c("g1", "g1", "g1", "g2", "g2", "g2", "g2", "g2", "g2"),
                       group2=c("k1", "a2", "a2", "c5", "n6", "n6", "n6", "m10","m10"),
                       count= c( 1, 1,2, 1, 2, 2, 2,3,3 ));

I have the group1 and group2 variables and want to add the count
variable to sequentially count unique rows defined by group1 and group2.

I hoped to use the following functions in library(tidyverse). None of
them worked well.

    test %>% group_by(group1, group2) %>% mutate(count = row_number())
    test %>% group_by(group1, group2) %>% mutate(count = 1:n())
    test %>% group_by(group1, group2) %>% mutate(count = seq_len(n()))
    test %>% group_by(group1, group2) %>% mutate(count = seq_along(group1, group2))

Can you help me to make the third column in the test data frame?

Thank you,

Ding
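As the reply notes, the request can be read in more than one way. The sketch below follows just one reading, stated here as an assumption: the counter restarts within each group1 value and advances whenever a new group2 value first appears. It yields 1, 2, 2 for the first three rows, whereas the posted count column starts 1, 1, 2, so the intended rule may well differ. The column name `seq_id` is illustrative.

```r
test <- data.frame(
  group1 = c("g1", "g1", "g1", "g2", "g2", "g2", "g2", "g2", "g2"),
  group2 = c("k1", "a2", "a2", "c5", "n6", "n6", "n6", "m10", "m10")
)

# Within each group1 block, number the distinct group2 values in order of
# first appearance; repeated rows share the number of their first occurrence.
test$seq_id <- ave(
  seq_len(nrow(test)), test$group1,
  FUN = function(i) match(test$group2[i], unique(test$group2[i]))
)
print(test)
```

Passing row indices to ave() keeps the result an integer vector, since ave() returns a vector of the same type as its first argument.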