Re: [R] RWeka cross-validation and Weka_control Parametrization
On Wed, 01 Aug 2007 10:52:02 +0200, Bjoern wrote:

> Hello,
>
> I have two questions concerning the RWeka package:
>
> 1.) How can one perform a cross-validation (say, 10-fold) for a given
>     data set and a given model?
>
> 2.) What is the correct syntax for the parametrization of e.g. kernel
>     classifiers in the interface?
>
>     m1 <- SMO(Species ~ ., data = iris, control = Weka_control(K = "weka.classifiers.functions.supportVector.RBFKernel", G = 0.1))
>     m2 <- SMO(Species ~ ., data = iris, control = Weka_control(K = "weka.classifiers.functions.supportVector.RBFKernel", G = 1.0))
>     m1
>     SMO
>     Kernel used:
>       RBF kernel: K(x,y) = e^-(0.01* <x-y,x-y>^2)
>     ## should be: RBF kernel: K(x,y) = e^-(0.1* <x-y,x-y>^2)
>     etc.

The answer to question 2 is surprisingly simple, but nevertheless took
me about half an hour to find:

  m2 <- SMO(Species ~ ., data = iris, control = Weka_control(K = "weka.classifiers.functions.supportVector.RBFKernel -G 2"))

gives

  R> m2
  SMO
  Kernel used:
    RBF kernel: K(x,y) = e^-(2.0* <x-y,x-y>^2)

[Using Weka_control(K = ..., G = ...) passes the G option to SMO but not
to RBFKernel. The docs for SMO() say

  -K <classname and parameters>
      The Kernel to use. (default: weka.classifiers.functions.supportVector.PolyKernel)

and one needs to remember Weka's command line style interface to realize
that this translates into putting everything into a single string for
the K option.]

This is of course not quite what R users would expect, and we'll try to
improve the Weka control mechanism so that specifying (Weka class)
options which require additional parameters becomes more convenient.

Best
-k

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
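For readers of this thread, a minimal sketch of both questions, assuming a current RWeka installation with a working Java runtime. The kernel string is the form given in the reply above. The cross-validation call, evaluate_Weka_classifier() with its numFolds and seed arguments, is taken from current RWeka documentation and is not mentioned in this thread, so treat it as an assumption about the installed version. The name kernel_spec is only illustrative.

```r
# Kernel class and its options go into ONE string, Weka command-line style,
# so that -G reaches RBFKernel rather than SMO.
kernel_spec <- paste("weka.classifiers.functions.supportVector.RBFKernel",
                     "-G", 2)

# Guarded so the script still runs where RWeka (or Java) is unavailable.
if (requireNamespace("RWeka", quietly = TRUE)) {
  m2 <- RWeka::SMO(Species ~ ., data = iris,
                   control = RWeka::Weka_control(K = kernel_spec))
  print(m2)

  # Assumption: this RWeka version provides evaluate_Weka_classifier();
  # numFolds = 10 requests 10-fold cross-validation, seed fixes the folds.
  cv <- RWeka::evaluate_Weka_classifier(m2, numFolds = 10, seed = 1)
  print(cv)
}
```

Building the string with paste() keeps the gamma value as an ordinary R number, which may be convenient when looping over several settings.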
Re: [R] get.hist.quote problem yahoo
Rene Braeckman writes:

> I had the same problem some time ago. Below is a function that I picked
> up on the web somewhere (can't remember where; may have been a
> newsletter). It's based on the tseries function, but the difference is
> that this function produces a data frame with a column containing the
> dates of the quotes, instead of a time series object. I had to replace
> %d-%b-%y by %Y-%m-%d to make it work, probably, as you stated, because
> the format was changed by Yahoo.

This issue should be taken care of now by a new release of tseries I put
out two days ago.

-k

> Hope this helps.
> Rene
>
> # --
> # df.get.hist.quote() function
> #
> # Based on code by A. Trapletti (package tseries)
> #
> # The main difference is that this function produces a data frame with
> # a column containing the dates of the quotes, instead of a time series
> # object.
> df.get.hist.quote <- function(instrument = "ibm", start, end,
>                               quote = c("Open", "High", "Low", "Close", "Volume"),
>                               provider = "yahoo", method = "auto")
> {
>     if (missing(start)) start <- "1970-01-02"
>     if (missing(end)) end <- format(Sys.time() - 86400, "%Y-%m-%d")
>     provider <- match.arg(provider)
>     start <- as.POSIXct(start, tz = "GMT")
>     end <- as.POSIXct(end, tz = "GMT")
>     if (provider == "yahoo") {
>         url <- paste("http://chart.yahoo.com/table.csv?s=", instrument,
>                      format(start, "&a=%m&b=%d&c=%Y"),
>                      format(end, "&d=%m&e=%d&f=%Y"),
>                      "&g=d&q=q&y=0&z=", instrument, "&x=.csv", sep = "")
>         destfile <- tempfile()
>         status <- download.file(url, destfile, method = method)
>         if (status != 0) {
>             unlink(destfile)
>             stop(paste("download error, status", status))
>         }
>         status <- scan(destfile, "", n = 1, sep = "\n", quiet = TRUE)
>         if (substring(status, 1, 2) == "No") {
>             unlink(destfile)
>             stop(paste("No data available for", instrument))
>         }
>         x <- read.table(destfile, header = TRUE, sep = ",")
>         unlink(destfile)
>         nser <- pmatch(quote, names(x))
>         if (any(is.na(nser)))
>             stop("This quote is not available")
>         n <- nrow(x)
>         lct <- Sys.getlocale("LC_TIME")
>         Sys.setlocale("LC_TIME", "C")
>         on.exit(Sys.setlocale("LC_TIME", lct))
>         dat <- gsub(" ", "0", as.character(x[, 1]))
>         dat <- as.POSIXct(strptime(dat, "%Y-%m-%d"), tz = "GMT")
>         if (dat[n] != start)
>             cat(format(dat[n], "time series starts %Y-%m-%d\n"))
>         if (dat[1] != end)
>             cat(format(dat[1], "time series ends %Y-%m-%d\n"))
>         return(data.frame(cbind(Date = I(format(dat[n:1], "%Y-%m-%d")),
>                                 x[n:1, nser]),
>                           row.names = 1:n))
>     }
>     else stop("Provider not implemented")
> }
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Daniele Amberti
> Sent: Friday, February 09, 2007 5:22 AM
> To: r-help
> Subject: [R] get.hist.quote problem yahoo
>
> I have functions using get.hist.quote() from library tseries.
>
> It seems that something changed (at Yahoo) and the function got broken.
>
> Try a simple get.hist.quote('IBM') and let me know if it is still
> working for anyone.
>
> I get this error:
>
> Error in if (!quiet && dat[n] != start) cat(format(dat[n], "time series starts %Y-%m-%d\n")) :
>         missing value where TRUE/FALSE needed
>
> Looking at the code, it seems that the dates in Yahoo's csv file were
> previously not in ISO format. Now they are in ISO standard
> year-month-day format.
>
> Does anyone else get the same problem?
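The failure reported in this thread can be reproduced without any download. The following is a small base-R sketch, not the tseries code itself: the two dates are hypothetical sample values in the ISO layout, and the names iso_dates, old, new and res are only illustrative. Parsing ISO dates with the old %d-%b-%y format yields NA, and comparing NA inside if() is what raises the "missing value where TRUE/FALSE needed" error.

```r
# Hypothetical sample of the date column in the newer ISO layout,
# newest quote first, as in the downloaded table.
iso_dates <- c("2007-02-08", "2007-02-07")
start <- as.POSIXct("2007-02-07", tz = "GMT")

# Old format string: does not match ISO dates, so every element is NA.
old <- as.POSIXct(strptime(iso_dates, "%d-%b-%y", tz = "GMT"))
print(is.na(old))

# New format string: parses as intended.
new <- as.POSIXct(strptime(iso_dates, "%Y-%m-%d", tz = "GMT"))
print(new)

# The comparison from the function body, wrapped so the error is captured
# as a message instead of stopping the script.
res <- tryCatch(
  if (old[2] != start) "differs" else "same",
  error = function(e) conditionMessage(e)
)
print(res)
```

Checking anyNA() on the parsed dates right after strptime() would turn this kind of silent format change into an explicit, easier-to-read failure.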
Re: [R] R CMD check fails at package dependencies check on Fedora Core 5, works on other systems
Marc Schwartz (via MN) writes:

> On Tue, 2006-09-19 at 22:16 +1000, Robert King wrote:
>> Here is another thing that might help work out what is happening. If I
>> use --no-install, ade4 actually fails as well, in the same way as zipfR.
>>
>> [Desktop]$ R CMD check --no-install ade4
>> * checking for working latex ... OK
>> * using log directory '/home/rak776/Desktop/ade4.Rcheck'
>> * using Version 2.3.1 (2006-06-01)
>> * checking for file 'ade4/DESCRIPTION' ... OK
>> * this is package 'ade4' version '1.4-1'
>> * checking if this is a source package ... OK
>> * checking package directory ... OK
>> * checking for portable file names ... OK
>> * checking for sufficient/correct file permissions ... OK
>> * checking DESCRIPTION meta-information ... ERROR
>>
>> [Desktop]$ R CMD check --no-install zipfR
>> * checking for working latex ... OK
>> * using log directory '/home/rak776/Desktop/zipfR.Rcheck'
>> * using Version 2.3.1 (2006-06-01)
>> * checking for file 'zipfR/DESCRIPTION' ... OK
>> * checking extension type ... Package
>> * this is package 'zipfR' version '0.6-0'
>> * checking if this is a source package ... OK
>> * checking package directory ... OK
>> * checking for portable file names ... OK
>> * checking for sufficient/correct file permissions ... OK
>> * checking DESCRIPTION meta-information ... ERROR
>
> <snip>
>
> Robert,
>
> I tried the process last night (my time) using the initial instructions
> on my FC5 system with:
>
> $ R --version
> R version 2.3.1 Patched (2006-08-06 r38829)
> Copyright (C) 2006 R Development Core Team
>
> I could not replicate the problem.
>
> However, this morning, with your additional communication:
>
> $ R CMD check --no-install zipfR_0.6-0.tar.gz
> * checking for working latex ... OK
> * using log directory '/home/marcs/Downloads/zipfR.Rcheck'
> * using Version 2.3.1 Patched (2006-08-06 r38829)
> * checking for file 'zipfR/DESCRIPTION' ... OK
> * checking extension type ... Package
> * this is package 'zipfR' version '0.6-0'
> * checking if this is a source package ... OK
> * checking package directory ... OK
> * checking for portable file names ... OK
> * checking for sufficient/correct file permissions ... OK
> * checking DESCRIPTION meta-information ... OK
> * checking top-level files ... OK
> * checking index information ... OK
> * checking package subdirectories ... OK
> * checking R files for syntax errors ... OK
> * checking R files for library.dynam ... OK
> * checking S3 generic/method consistency ... OK
> * checking replacement functions ... OK
> * checking foreign function calls ... OK
> * checking Rd files ... OK
> * checking Rd cross-references ... WARNING
> Warning in grep(pattern, x, ignore.case, extended, value, fixed, useBytes) :
>          input string 70 is invalid in this locale
> * checking for missing documentation entries ... WARNING
> Warning in grep(pattern, x, ignore.case, extended, value, fixed, useBytes) :
>          input string 70 is invalid in this locale
> All user-level objects in a package should have documentation entries.
> See chapter 'Writing R documentation files' in manual 'Writing R
> Extensions'.
> * checking for code/documentation mismatches ... OK
> * checking Rd \usage sections ... OK
> * checking DVI version of manual ... OK
>
> WARNING: There were 2 warnings, see
>   /home/marcs/Downloads/zipfR.Rcheck/00check.log
> for details
>
> So I am wondering if this raises the possibility of a locale issue on
> your FC5 system resulting in a problem reading DESCRIPTION files? It may
> be totally unrelated, but one never knows, I suppose.
>
> Mine is:
>
> $ locale
> LANG=en_US.UTF-8
> LC_CTYPE=en_US.UTF-8
> LC_NUMERIC=en_US.UTF-8
> LC_TIME=en_US.UTF-8
> LC_COLLATE=en_US.UTF-8
> LC_MONETARY=en_US.UTF-8
> LC_MESSAGES=en_US.UTF-8
> LC_PAPER=en_US.UTF-8
> LC_NAME=en_US.UTF-8
> LC_ADDRESS=en_US.UTF-8
> LC_TELEPHONE=en_US.UTF-8
> LC_MEASUREMENT=en_US.UTF-8
> LC_IDENTIFICATION=en_US.UTF-8
> LC_ALL=
>
> HTH,
>
> Marc Schwartz

That's a bug in tools:::Rd_aliases (it needs to preprocess the Rd lines,
which re-encodes if necessary and possible). I'll commit a fix later
today.

Thanks for spotting this.

Best
-k
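To illustrate what "re-encodes if necessary and possible" can mean in practice, here is a small base-R sketch. It is not the fix committed to tools; the sample string is hypothetical and assumed to be Latin-1, and validUTF8() requires a reasonably recent R (3.3.0 or later). Bytes that are valid Latin-1 but not valid UTF-8 are the kind of input that makes grep() complain about an invalid string in a UTF-8 locale.

```r
# Hypothetical Rd line containing a Latin-1 encoded a-umlaut (byte 0xE4).
x <- "Universit\xe4t"

# The raw bytes are not valid UTF-8, which is what a UTF-8 locale expects.
print(validUTF8(x))

# Re-encode from the assumed source encoding to UTF-8.
y <- iconv(x, from = "latin1", to = "UTF-8")
print(validUTF8(y))
print(nchar(y))
```

The source encoding has to be known or guessed; iconv() returns NA for elements it cannot convert, so checking for NA after the call is a sensible precaution.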
[R] Job Openings at WU Wien
The Department of Statistics and Mathematics at the Vienna University of Economics and Business Administration invites applications for two new faculty positions in computational statistics and quantitative research methodology, to begin in fall 2005. The positions will be at the Assistant level. Candidates should have a strong potential for statistical computing or intramural research support and statistical consulting, be interested in involving graduate and undergraduate students in their research, have completed their Ph.D. or a comparable degree by June 2005, and be citizens of the European Union. We seek candidates who can teach a graduate course in advanced applied statistics and quantitative research methodology and other courses at the graduate and undergraduate level, mentor students in undergraduate and graduate research projects as well as Master's and Ph.D. theses. Candidates should have a strong background in one of the following areas: psychometrics, computational management or social sciences, and information systems. Desirable knowledge and skills include topics such as statistical software development, quantitative research methodology, and advanced applied statistical techniques. The Department of Statistics and Mathematics at the Vienna University of Economics and Business Administration (WU Wien) has a strong research focus with currently 14 full time faculty with substantial graduate and undergraduate teaching responsibilities. It maintains a leading position in the development of R, a comprehensive open source environment for statistical computing. WU Wien (http://www.wu-wien.ac.at/english/about) is one of the leading Central European institutions for international business education with about 20,000 students and more than 1,000 full-time and adjunct faculty and staff members. 
Applicants should submit a letter of interest (with reference number 43448 [6-yr position] or 42948 [4-yr position]), current vitae, recent papers, etc., by July 18, 2005 to

  PERSONALABTEILUNG
  Wirtschaftsuniversitaet Wien
  Augasse 2-6
  1090 Vienna
  Austria

Kurt Hornik, Chair
Department of Statistics and Mathematics
Wirtschaftsuniversitaet Wien
Re: [R] algorithms for matching and Hungarian method
Martin Olivier writes:

> Hi all,
>
> I would like to match two partitions. That is, if I have exactly the
> same objects grouped together in the two partitions, the labels may
> still be arbitrarily permuted, and so I would like to know the
> correspondences of the groups between the two clusterings.
>
> In the e1071 package, it is possible to use the function matchClasses()
> for this problem. The problem is that for k greater than 10 (k being
> the number of classes), I run out of memory.
>
> So I would like to know whether this function explicitly examines all
> k! possible matches, or whether it uses the Hungarian method (or
> another optimal algorithm). If not, do you know where I can find one?

e1071::matchClasses() does not. However, clue::solve_LSAP() provides an
implementation of the Hungarian method for solving the LSAP.

Of course, you can also use the Simplex algorithm for solving the LSAP,
and in fact lpSolve::lp.assign() does that for you.

Hth
-k
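A minimal sketch of how the matching could be set up with clue::solve_LSAP(), assuming the clue package is installed. The LSAP is the linear sum assignment problem; here the "profit" of pairing two groups is the number of objects they share. The two label vectors are hypothetical example data, and the names a, b, agree and perm are only illustrative.

```r
# Hypothetical data: two partitions of the same 9 objects into k = 3 groups,
# with labels permuted relative to each other and one object disagreeing.
a <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
b <- c(3, 3, 3, 1, 1, 2, 2, 2, 2)
k <- 3

# Agreement table: entry [i, j] counts objects in group i of a and group j
# of b. Shared factor levels keep the matrix square even if a group is empty.
agree <- unclass(table(factor(a, levels = 1:k), factor(b, levels = 1:k)))
print(agree)

# Guarded so the script still runs where clue is not installed.
if (requireNamespace("clue", quietly = TRUE)) {
  # maximum = TRUE asks for the assignment with the largest total agreement.
  perm <- clue::solve_LSAP(agree, maximum = TRUE)
  # Row i of the result pairs group i of a with group perm[i] of b.
  print(cbind(a = seq_len(k), b = as.integer(perm)))
  # Number of objects on which the matched partitions agree.
  print(sum(agree[cbind(seq_len(k), as.integer(perm))]))
}
```

Because only the k x k agreement table is passed to the solver, memory use no longer depends on enumerating permutations, which is the problem reported for k greater than 10.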