Re: [R] Time complexity of functions in R

2018-05-23 Thread Suzen, Mehmet
Hello Neha,

You can measure the time complexity of those operations empirically yourself.
First, generate benchmark datasets of increasing object size, e.g., for
set A, and time each operation on them. Have a look at how to use
'system.time':

https://stat.ethz.ch/R-manual/R-devel/library/base/html/system.time.html
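
A minimal sketch of such a measurement, assuming the sets are stored as
integer vectors. The sizes and the names A, B and timings are chosen here for
illustration and are not from the original question:

```r
# Time base R set operations on integer vectors of growing size.
set.seed(1)
sizes <- c(1e4, 1e5, 1e6)

timings <- sapply(sizes, function(n) {
  A <- sample.int(2 * n, n)   # n distinct values drawn from 1..2n
  B <- sample.int(2 * n, n)
  c(union     = system.time(union(A, B))[["elapsed"]],
    setdiff   = system.time(setdiff(A, B))[["elapsed"]],
    intersect = system.time(intersect(A, B))[["elapsed"]])
})
colnames(timings) <- sizes

# Rows are operations, columns are input sizes; compare how the elapsed
# time grows as the size grows by a factor of 10.
print(timings)
```

In base R these functions are built on unique() and match(), which use
hashing, so roughly linear growth would be the expectation, but the measured
timings on your own data are what count.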

Best,
Mehmet

On 24 May 2018 at 04:40, Neha Aggarwal  wrote:
> Hi,
>
> I have implemented an algorithm in R in which I use a while loop with some
> set operations inside it, for example:
>
> while (condition) {
>   union(A, B)       # set union
>   setdiff(C, D)     # set C minus set D
>   intersect(D, E)   # set intersection
> }
> I want to calculate the complexity of my algorithm. Can you tell me the
> complexity of the union, intersection and set-minus operations/functions in R?
> Is it O(n) or O(log n)?
>
> Also, can anyone point me to a good resource to read about it?
>
> Thanks,
> Neha
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: Alternative for numpy in Spark Mlib

2018-05-23 Thread Suzen, Mehmet
You can use Breeze, which is part of the Spark distribution:
https://github.com/scalanlp/breeze/wiki/Breeze-Linear-Algebra

Check out the modules under import breeze._

On 23 May 2018 at 07:04, umargeek  wrote:
> Hi Folks,
>
> I am planning to rewrite one of my Python modules, written for entropy
> calculation using numpy, in Spark MLlib so that it can be processed in a
> distributed manner.
>
> Can you please advise on the possibilities of the same approach or any
> alternatives.
>
> Thanks,
> Umar
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>




Re: [R] Comparing figures?

2018-05-07 Thread Suzen, Mehmet
I suggest perceptual diff. You could write a wrapper around it.

http://pdiff.sourceforge.net
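
A minimal sketch of such a wrapper, under stated assumptions: the
command-line tool is assumed to be installed and on the PATH as
"perceptualdiff", to take the two image files as its first two arguments, and
to exit with status 0 when the images are judged equal. The function name
figures_match is made up for illustration; please check the tool's own help
output before relying on any of this.

```r
# Hypothetical wrapper: returns TRUE when the external comparison command
# exits with status 0 (assumed here to mean "images match").
figures_match <- function(reference, candidate, cmd = "perceptualdiff") {
  status <- system2(cmd,
                    args = shQuote(c(reference, candidate)),
                    stdout = FALSE, stderr = FALSE)
  identical(as.integer(status), 0L)
}

# Usage (not run here, needs the tool and two image files):
# figures_match("reference.png", "test.png")
```

Note that this still goes through files on disk, so it addresses the
comparison itself rather than the in-memory part of the question.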

On Mon, 7 May 2018 16:49 Ramiro Barrantes, 
wrote:

> Hello,
>
> I am working on tests to compare figures.  I have been using ImageMagick,
> which creates a figure signature, and I can compare a "test" figure
> signature against a saved "reference" figure signature.  It seems to work
> pretty well.  However, it is slow as it requires reading from the file
> system.
>
> Are there any options to compare figures in memory?  For example, if I
> generate a ggplot or lattice graph, I could have all my saved "reference"
> figures in memory (which I would have loaded all at once) and compare them.
> I just haven't found anything.
>
> I just found out about the vdiffr package and was going to explore it, not
> sure about the speed.
>
> Any suggestions appreciated.
>
> Thank you,
> Ramiro
>
> Ramiro Barrantes Ph.D.
> Precision Bioassay, Inc.
> 431 Pine St., Suite 110
> Burlington, VT 05401
> 802 865 0155
> 802 861 2365 FAX
> www.precisionbioassay.com
> ram...@precisionbioassay.com
>


Re: [R-pkg-devel] Fwd: Collaboration Request: h2o R Package Function Cheatsheet

2018-02-04 Thread Suzen, Mehmet
Dear Juan,

A good start. A suggestion for versioning: instead of versioning in
file names, maybe you can use git tags for release numbers. GitHub
will create a release bundle from your release tag. RStudio has nice
templates for cheatsheets too [1]; I think you use their template, and
possibly you could contribute there.

Best,
-m

[1] 
https://www.rstudio.com/resources/cheatsheets/how-to-contribute-a-cheatsheet/


Mehmet Süzen



On 3 February 2018 at 23:00, Juan Telleria Ruiz de Aguirre
 wrote:
> Dear R Package Developers,
>
> I have just started doing a cheatsheet for h2o R Package:
>
> https://cran.r-project.org/web/packages/h2o/index.html
>
> So if anyone is interested in contributing, I attach what I have done
> so far on GitHub:
>
> https://github.com/jtelleria/H2O-Cheatsheet
>
> A H2O.ai Statistical Algorithms Cheatsheet already exists, but the new
> one will be focused on R h2o package functions:
>
> https://github.com/h2oai/h2o-tutorials/blob/master/training/h2o_algos/h2o_algos_cheat_sheet_04_25_17.pdf
>
> Kind regards,
> Juan Telleria
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Why R should never move to git

2018-01-31 Thread Suzen, Mehmet
On 31 January 2018 at 16:18, Barry Rowlingson
 wrote:
>>
>
> Let the record also state that *gitlab* is an open source project and can be
> downloaded and self-hosted, like gogs, but unlike github.


Good to know. Nice one: https://github.com/gitlabhq/gitlabhq

Best,
-m


> PS I've been running a gitlab instance for my group for a couple of years on
> a private server.

Is it a smooth ride so far?

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Best practices in developing package: From a single file

2018-01-31 Thread Suzen, Mehmet
Dear Dr. Pfaff,

Thank you for this. Creating a package out of a single file was my
original question; not only creating it, but also maintaining it that
way, so that the R package is an artifact of the development process rather
than a "manually maintained" structure. I will have a look at your sources.

Best,

Mehmet Süzen



On 31 January 2018 at 15:51, Pfaff, Bernhard Dr.
 wrote:
> Dear All:
>
> stepping in late, but @Joris, if you would like to take 'from a single file' 
> literally,
> have a look at:
>
> https://github.com/bpfaff/lp4rp
>
> (lp4rp: literate programming for R packages);
>
> Cheers,
> Bernhard
>
> ps:  incidentally, within the noweb-file roxygen is employed.
>
> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Joris Meys
> Sent: Wednesday, 31 January 2018 14:02
> To: Duncan Murdoch
> Cc: r-devel
> Subject: [EXT] Re: [Rd] Best practices in developing package: From a single
> file
>
> On Wed, Jan 31, 2018 at 1:41 PM, Duncan Murdoch 
> wrote:
>
>> On 31/01/2018 6:33 AM, Joris Meys wrote:
>>
>>> 3. given your criticism, I'd like your opinion on where I can improve
>>> the documentation of https://github.com/CenterForStatistics-UGent/pim.
>>> I'm currently busy updating the help files for a next release on
>>> CRAN, so your input is more than welcome.
>>>
>>
>> After this invitation I sent some private comments to Joris.  I would
>> say his package does a pretty good job of documentation; it isn't the
>> kind of Roxygen-using package that I was complaining about.  So I will
>> say I have received an example of a Roxygen-using package that has
>> good help pages.
>>
>
> Thank you for the nice compliment and the valuable tips.
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling Ghent University 
> Coupure Links 653, B-9000 Gent (Belgium) 
> 
>
> ---
> Biowiskundedagen 2017-2018
> http://www.biowiskundedagen.ugent.be/
>

Re: [Rd] Why R should never move to git

2018-01-30 Thread Suzen, Mehmet
Gabor, I was just pointing out options. I think it is more of a policy
decision than a technical one. For example, the very mailing list we
are using is run by ETH Zurich with Martin Maechler, but it could just as
well be run on Google Groups. Maybe this list should also move to Google
Groups; it is unlikely that Google would shut down Google Groups soon.

Best,
-m

On 31 January 2018 at 00:26, Gábor Csárdi <csardi.ga...@gmail.com> wrote:
> While this is a very hypothetical argument, you could at least explain
> _why_ you would think so.
>
> If you were thinking about the unlikely event of GitHub / GitLab
> closing business, that is _not_ such a big deal to any active project that
> is hosted there.
>
> Gabor
>
> On Tue, Jan 30, 2018 at 11:07 PM, Suzen, Mehmet <mehmet.su...@gmail.com> 
> wrote:
>> This might be off topic, but if R-core development ever moves to git,
>> I think it would make sense to have its own git service hosted by a
>> university, rather than using
>> github or gitlab. It is possible via https://gogs.io/ project.
>>
>> Just for the record.
>>
>> Best,
>> -m
>>

Re: [Rd] Why R should never move to git

2018-01-30 Thread Suzen, Mehmet
This might be off topic, but if R-core development ever moves to git,
I think it would make sense to have its own git service hosted by a
university, rather than using GitHub or GitLab. This is possible with
the Gogs project (https://gogs.io/).

Just for the record.

Best,
-m



Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Suzen, Mehmet
On 30 January 2018 at 21:31, Cook, Malcolm  wrote:
>
> I think you want to see the approach to generating a skeleton from a single 
> .R file presented in:
>
> Simple and sustainable R packaging using inlinedocs 
> http://inlinedocs.r-forge.r-project.org/
>
> I have not used it in some time but found it invaluable when I did.

For the record, the package has a JSS article as well:

https://www.jstatsoft.org/article/view/v054i06


Best,
-m



Re: [Rd] Best practices in developing package: From a single file

2018-01-30 Thread Suzen, Mehmet
Dear All,

Thank you for all the valuable input, and sorry for being off-topic for
the list. I will try R-pkg-devel for further related questions. I was
actually after "one-go" auto-documentation, in-line or out of comments,
from a single file/environment, in a similar spirit to
'package.skeleton' or an extension of it. My take-home message, or
summary of all the responses so far:

* Regarding documentation: Duncan Murdoch's wisdom, "...to get good
stuff in the help page, you need just as much work as in writing the
.Rd file directly..". So there is no silver bullet in terms of
auto-documentation, I gather, especially if one uses more complex
constructs such as S4/R6 classes or Rcpp code behind the scenes. On the
other hand, roxygen2 appears to be the most comprehensive solution.

* Lightweight solutions to try out before moving to RStudio fully: I
will give Dirk's 'pkgKitten' and the 'inlinedocs' package Malcolm
mentioned a try.
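
For reference, the baseline that these tools improve on can be sketched with
package.skeleton() from the utils package. This is only an illustration: the
file name, the function add2 and the package name demoPkg are made up, and
the roxygen-style comment in the source file is ignored by package.skeleton(),
which only writes stub .Rd files that still have to be filled in by hand.

```r
# Write a one-function source file (hypothetical content).
src <- file.path(tempdir(), "single_file.R")
writeLines(c(
  "#' Add two numbers (hypothetical example function)",
  "add2 <- function(x, y) x + y"
), src)

# Build a package skeleton around that single file.
pkg_root <- file.path(tempdir(), "skeleton_demo")
dir.create(pkg_root, showWarnings = FALSE)
utils::package.skeleton(name = "demoPkg", code_files = src,
                        path = pkg_root, force = TRUE)

# Inspect what was generated: DESCRIPTION, NAMESPACE, R/ and man/ stubs.
list.files(file.path(pkg_root, "demoPkg"), recursive = TRUE)
```

Turning the in-line comments into help pages would be a separate step with
a tool such as roxygen2 or inlinedocs, which is not shown here.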

Interestingly, the responses have reminded me of Larry Wall's quote
(https://en.wikipedia.org/wiki/There%27s_more_than_one_way_to_do_it),
which I think really applies to R more than any language I have
encountered so far, from different class systems to different
time-series representations, so richly democratised.

Many regards,
Mehmet


On 30 January 2018 at 17:00, Suzen, Mehmet <mehmet.su...@gmail.com> wrote:
> Dear R developers,
>
> I am wondering what are the best practices for developing an R
> package. I am aware of Hadley Wickham's best practice
> documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
> years ago there were some tools for generating a package out of a
> single file, such as using package.skeleton, but no auto-generated
> documentation. Do you know a way to generate documentation and a
> package out of single R source file, or from an environment?
>
> Many thanks,
> Mehmet



[Rd] Best practices in developing package: From a single file

2018-01-30 Thread Suzen, Mehmet
Dear R developers,

I am wondering what are the best practices for developing an R
package. I am aware of Hadley Wickham's best practice
documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
years ago there were some tools for generating a package out of a
single file, such as using package.skeleton, but no auto-generated
documentation. Do you know a way to generate documentation and a
package out of a single R source file, or from an environment?

Many thanks,
Mehmet



Re: [Rd] R CMD check warning about compiler warning flags

2017-12-25 Thread Suzen, Mehmet
On 26 December 2017 at 00:00, Juan Telleria  wrote:
> Maybe I'm new, and forgive my ignorance, but maybe in the future (~ X years
> from now) the R Project could be managed entirely from github, by doing

I strongly disagree. Are you aware that GitHub is a commercial
company, GitHub, Inc. [1]?
What about GitLab, or Microsoft's CodePlex? There are other services
similar to GitHub, so why GitHub?
What happens if GitHub goes out of business?

The R project should be maintained within the academic network and under
the auspices of universities.


[1] GitHub, Inc.
    https://en.wikipedia.org/wiki/GitHub



Re: [Rd] binary form of is() contradicts its unary form

2017-11-30 Thread Suzen, Mehmet
On 30 November 2017 at 16:30, Iñaki Úcar  wrote:
> If you really believe that references should be needed to know what to
> expect from a function call, then we work with different definitions

The behaviour of a function call can be quite complex, depending on
the characteristics of its arguments. It may not always be possible to
boil down all possible behaviours of a function to a man page as in `?`,
and sometimes giving a reference to a larger exposition makes more
sense.


Re: [Rd] binary form of is() contradicts its unary form

2017-11-30 Thread Suzen, Mehmet
On 30 Nov 2017 14:32, "Iñaki Úcar"  wrote:

>>> Am I supposed to read every reference on a man page just to know what
>>> to expect from a function?
>>
>> If the reference is from John Chambers, you are supposed to read it.
>
> As a joke, it's funny.

Not a joke. John Chambers is the authority on R object systems. Please do
not mock him or resources pointing to his work.

>> It is not always possible for maintainers to document everything on a man
>> page.
>
> My only point is that Hervé's concern is perfectly legitimate given
> the output of "?is". Whether the inconsistency is in the behaviour of
> the function or in the documentation, that I don't know. Personally, I

There is no inconsistency as far as I understand; data.frame does not have a
pure S4 super-class hierarchy.


Re: [Rd] binary form of is() contradicts its unary form

2017-11-30 Thread Suzen, Mehmet
On 30 November 2017 at 14:04, Iñaki Úcar  wrote:
>
> Am I supposed to read every reference on a man page just to know what
> to expect from a function?
>

If the reference is from John Chambers, you are supposed to read it.
It is not always possible for maintainers to document everything on a man page.


Re: [Rd] binary form of is() contradicts its unary form

2017-11-30 Thread Suzen, Mehmet
On 30 November 2017 at 11:37, Iñaki Úcar <i.uca...@gmail.com> wrote:
> 2017-11-30 3:14 GMT+01:00 Suzen, Mehmet <mehmet.su...@gmail.com>:
>> My understanding is that there is no inconsistency. `is` does what it
>> claims, from the documentation:
>>
>> ‘is’: With two arguments, tests whether ‘object’ can be treated as
>>   from ‘class2’.
>>
>>   With one argument, returns all the super-classes of this
>>   object's class.
>
> Note that this has not been in the documentation for a year now.
>

As far as I understand, starting from methods v3.3.2, the following
new reference was added:

* Chambers, John M. (2016) Extending R, Chapman & Hall. (Chapters 9 and 10.)

I assume those details were moved there.

Best,
Mehmet


Re: [Rd] binary form of is() contradicts its unary form

2017-11-29 Thread Suzen, Mehmet
On 29 November 2017 at 21:45, Hervé Pagès  wrote:
> You're missing the point of my original post. Which is that
> there is a serious inconsistency between the unary and binary
> forms of is(). Maybe the binary form is right in case of

My understanding is that there is no inconsistency. `is` does what it
claims, from the documentation:

‘is’: With two arguments, tests whether ‘object’ can be treated as
  from ‘class2’.

  With one argument, returns all the super-classes of this
  object's class.

The important phrase there is 'can be treated as from', with two arguments.
So one cannot treat `data.frame` as from the 'list' class in a simple sense,
even though it inherits from list. The complication is that list is a
primitive type, and this does not come from a clean S4 hierarchy, cf. your
A, B example.

Also, strictly speaking, having super-classes resolved does not
automatically justify the assumption that the object can be treated as an
instance of one of its super-classes.
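
The distinction discussed here can be explored with a short sketch. The class
names A and B follow Hervé's example, with a slot `x` added as an assumption
so that both classes are concrete. No result is stated for the binary is()
call on a data frame: the thread reports FALSE for the R versions of the
time, and this may differ between versions.

```r
library(methods)

# Pure S4 hierarchy; the slot `x` is an addition for this sketch.
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")

b <- new("B", x = 1)
is(b)        # unary form: "B" followed by its super-class "A"
is(b, "A")   # binary form: TRUE for a clean S4 hierarchy

# data.frame is an S3 class that is registered with the S4 system.
df <- data.frame()
is(df)                          # super-classes as seen by the methods package
extends("data.frame", "list")   # class-level query
is(df, "list")                  # the call debated in this thread
inherits(df, "list")            # FALSE: "list" is not in the S3 class attribute
is.list(df)                     # TRUE: the underlying storage is a list
```

The last two lines show that "is a list" already has two different answers
at the S3 level, depending on whether the class attribute or the storage
type is inspected.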

Cheers,
Mehmet


Re: [Rd] binary form of is() contradicts its unary form

2017-11-29 Thread Suzen, Mehmet
Hi Herve,

Interesting observation with `setClass` but it is for S4.  It looks
like `data.frame()` is not an S4 class.

> isS4(data.frame())
[1] FALSE

And in your case this might help:

> is(asS4(data.frame()), "list")
[1] TRUE

It looks like `is` is designed for S4 classes, though I am not entirely sure.

Best,
-Mehmet

On 29 November 2017 at 20:46, Hervé Pagès <hpa...@fredhutch.org> wrote:
> Hi Mehmet,
>
> On 11/29/2017 11:22 AM, Suzen, Mehmet wrote:
>>
>> Hi Herve,
>>
>> I think you are confusing subclasses and classes. There is no
>> contradiction. `is` documentation
>> is very clear:
>>
>> `With one argument, returns all the super-classes of this object's class.`
>
>
> Yes that's indeed very clear. So if "list" is a super-class
> of "data.frame" (as reported by is(data.frame())), then
> is(data.frame(), "list") should be TRUE.
>
> With S4 classes:
>
>   setClass("A")
>   setClass("B", contains="A")
>
>   ## Get all the super-classes of B.
>   is(new("B"))
>   # [1] "B" "A"
>
>   ## Does a B object inherit from A?
>   is(new("B"), "A")
>   # [1] TRUE
>
> Cheers,
> H.
>
>>
>> Note that object class is always `data.frame` here, check:
>>
>>  > class(data.frame())
>> [1] "data.frame"
>>  > is(data.frame(), "data.frame")
>> [1] TRUE
>>
>> Best,
>> Mehmet
>>
>>
>>
>>
>>
>> On 29 Nov 2017 19:13, "Hervé Pagès" <hpa...@fredhutch.org> wrote:
>>
>> Hi,
>>
>> The unary forms of is() and extends() report that data.frame
>> extends list, oldClass, and vector:
>>
>>   > is(data.frame())
>>   [1] "data.frame" "list"   "oldClass"   "vector"
>>
>>   > extends("data.frame")
>>   [1] "data.frame" "list"   "oldClass"   "vector"
>>
>> However, the binary form of is() disagrees:
>>
>>   > is(data.frame(), "list")
>>   [1] FALSE
>>   > is(data.frame(), "oldClass")
>>   [1] FALSE
>>   > is(data.frame(), "vector")
>>   [1] FALSE
>>
>> while the binary form of extends() agrees:
>>
>>   > extends("data.frame", "list")
>>   [1] TRUE
>>   > extends("data.frame", "oldClass")
>>   [1] TRUE
>>   > extends("data.frame", "vector")
>>   [1] TRUE
>>
>> Who is right?
>>
>> Shouldn't 'is(object, class2)' be equivalent
>> to 'class2 %in% is(object)'? Furthermore, is there
>> any reason why 'is(object, class2)' is not implemented
>> as 'class2 %in% is(object)'?
>>
>> Thanks,
>> H.
>>
>> --
>> Hervé Pagès
>>
>> Program in Computational Biology
>> Division of Public Health Sciences
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B514
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: hpa...@fredhutch.org
>> Phone:  (206) 667-5791
>> Fax:(206) 667-1319
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319


Re: [Rd] binary form of is() contradicts its unary form

2017-11-29 Thread Suzen, Mehmet
Hi Herve,

I think you are confusing subclasses and classes. There is no
contradiction. The `is` documentation is very clear:

`With one argument, returns all the super-classes of this object's class.`

Note that object class is always `data.frame` here, check:

> class(data.frame())
[1] "data.frame"
> is(data.frame(), "data.frame")
[1] TRUE

Best,
Mehmet





On 29 Nov 2017 19:13, "Hervé Pagès"  wrote:

> Hi,
>
> The unary forms of is() and extends() report that data.frame
> extends list, oldClass, and vector:
>
>   > is(data.frame())
>   [1] "data.frame" "list"   "oldClass"   "vector"
>
>   > extends("data.frame")
>   [1] "data.frame" "list"   "oldClass"   "vector"
>
> However, the binary form of is() disagrees:
>
>   > is(data.frame(), "list")
>   [1] FALSE
>   > is(data.frame(), "oldClass")
>   [1] FALSE
>   > is(data.frame(), "vector")
>   [1] FALSE
>
> while the binary form of extends() agrees:
>
>   > extends("data.frame", "list")
>   [1] TRUE
>   > extends("data.frame", "oldClass")
>   [1] TRUE
>   > extends("data.frame", "vector")
>   [1] TRUE
>
> Who is right?
>
> Shouldn't 'is(object, class2)' be equivalent
> to 'class2 %in% is(object)'? Furthermore, is there
> any reason why 'is(object, class2)' is not implemented
> as 'class2 %in% is(object)'?
>
> Thanks,
> H.
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
>

Re: [R] run r script in r-fiddle

2017-10-31 Thread Suzen, Mehmet
Dear List,

According to the DataCamp support team, r-fiddle.org is not supported. We
asked them to take it down, as Professor Maechler suggested it is
a waste of time for R-help to respond to questions on something
that is unmaintained and severely outdated. If you would like to use
R from your browser, you can embed the following into a web page:

<script src="https://cdn.datacamp.com/datacamp-light-latest.min.js"></script>


Currently, it supports R 3.4.0. See the code base, which is open
source, here https://github.com/datacamp/datacamp-light

Hope it helps.

Best,
Mehmet



On 31 October 2017 at 15:09, Suzen, Mehmet <msu...@gmail.com> wrote:
> On 31 October 2017 at 12:42, Martin Maechler <maech...@stat.math.ethz.ch> 
> wrote:
>> Notably as I think it's been provided by a company that no
>> longer exists under that name, and even if that'd be wrong,  R-Fiddle
>> does not seem free software (apart from the R parts, I hope !).
>
> For the record, r-fiddle is maintained by datacamp:
> https://www.datacamp.com/community/blog/r-fiddle-an-online-playground-for-r-code-2



Re: [R] run r script in r-fiddle

2017-10-31 Thread Suzen, Mehmet
On 31 October 2017 at 12:42, Martin Maechler  wrote:
> Notably as I think it's been provided by a company that no
> longer exists under that name, and even if that'd be wrong,  R-Fiddle
> does not seem free software (apart from the R parts, I hope !).

For the record, r-fiddle is maintained by datacamp:
https://www.datacamp.com/community/blog/r-fiddle-an-online-playground-for-r-code-2



Re: [R] run r script in r-fiddle

2017-10-30 Thread Suzen, Mehmet
Note that r-fiddle appears to run R 3.1.2.



Re: [R] run r script in r-fiddle

2017-10-30 Thread Suzen, Mehmet
We were talking about r-fiddle. It gives an error there, which is why I
suggested using RCurl:

> source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")
...
unsupported URL scheme
Error : cannot open the connection
>

On 30 October 2017 at 15:51, Martin Maechler <maech...@stat.math.ethz.ch> wrote:
>>>>>> Suzen, Mehmet <msu...@gmail.com>
>>>>>> on Mon, 30 Oct 2017 11:16:30 +0100 writes:
>
> > Hi Frank, You could upload your R source file to a public
> > URL, for example to github and read via RCurl, as source
> > do not support https as far as I know.
>
> well... but your knowledge is severely (:-) outdated.
> Why did you not try first?
>
> source("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")
>
> works for me even in R 3.3.0 which is really outdated itself!
>



Re: [R] run r script in r-fiddle

2017-10-30 Thread Suzen, Mehmet
Hi Frank,

You could upload your R source file to a public URL, for example to
GitHub, and read it via RCurl, as source() does not support https as far
as I know. Here is a working example:

library('RCurl')
# Fetch the script as text from the raw GitHub URL, then parse and evaluate it.
script_text <- getURL("https://raw.githubusercontent.com/msuzen/isingLenzMC/master/R/isingUtils.R")
eval(parse(text = script_text))

Note that you need to use the raw URL for a GitHub file.
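
The parse-and-evaluate step can be tried without network access. A minimal
sketch, assuming the script text is already held in a string; the function
name `square` is made up for illustration:

```r
# Stand-in for text fetched from a URL; `square` is a hypothetical function.
script_text <- "square <- function(x) x^2"

# parse() turns the text into unevaluated expressions; eval() then runs
# them, which defines `square` in the current environment.
eval(parse(text = script_text))

square(4)
```

Since this runs whatever code the text contains, it should only be used on
sources you trust.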

Best,
-m





On 30 October 2017 at 01:56, Frank Mei  wrote:
> Hi All,
>
> I want to know how to run an R file on my computer in R-Fiddle?
>
> I tried source("filename.r"), but not working.
>
> thanks,
> Frank
>


Re: [R] Regarding Principal Component Analysis result Interpretation

2017-09-15 Thread Suzen, Mehmet
PCA is usually used when there is a large number of features. The
FactoMineR package [1] provides a couple of examples; check the
temperature example. You may also want to consult basic PCA material;
I suggest the book by Chris Bishop [2].


[1] https://cran.r-project.org/web/packages/FactoMineR/vignettes/clustering.pdf
[2] http://www.springer.com/de/book/9780387310732?referer=www.springer.de
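
A minimal sketch of how PCA output is commonly read, using base R's prcomp()
on the built-in USArrests data. Both are chosen here for illustration and are
not taken from the thread or from the FactoMineR vignette:

```r
# Standardise the variables, since they are measured on different scales.
pca <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# Loadings: how strongly each original variable contributes to PC1 and PC2.
pca$rotation[, 1:2]

# Scores: coordinates of the first few observations on PC1 and PC2.
head(pca$x[, 1:2])
```

The usual reading is: keep the leading components that explain most of the
variance, then interpret each one through the variables with the largest
absolute loadings.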



Re: RDD order preservation through transformations

2017-09-15 Thread Suzen, Mehmet
Hi Johan,
DataFrames are built on top of RDDs, so I am not sure whether the ordering
issues are any different there. Maybe you could create a minimal but
sufficiently large simulated dataset and an example series of
transformations to experiment on.
Best,
-m

Mehmet Süzen, MSc, PhD




On 15 September 2017 at 09:44,   wrote:
> Thanks all for your answers. After reading the provided links I am still 
> uncertain of the details of what I'd need to do to get my calculations right 
> with RDDs. However I discovered DataFrames and Pipelines on the "ML" side of 
> the libs and I think they'll be better suited to my needs.
>
> Best,
> Johan Grande
>
>



Re: RDD order preservation through transformations

2017-09-14 Thread Suzen, Mehmet
On 14 September 2017 at 10:42,   wrote:
> val noTs = myData.map(dropTimestamp)
>
> val scaled = scaler.transform(noTs)
>
> val projected = (new RowMatrix(scaled)).multiply(principalComponents).rows
>
> val clusters = myModel.predict(projected)
>
> val result = myData.zip(clusters)
>
>
>
> Do you think there’s a chance that the 4 transformations above would
> preserve order so the zip at the end would be correct?

AFAIK, no. The sequence of transformations you have is not guaranteed
to preserve order.
Apply the zip first (for example with `zipWithIndex`), then keep track
of the indices through the subsequent transformations via `_2`, since
zip returns tuples.




Re: RDD order preservation through transformations

2017-09-13 Thread Suzen, Mehmet
I think this is one of the conceptual differences between Spark and
other languages: there is no indexing in plain RDDs. This was the
thread with Ankit:

Yes. So order preservation cannot be guaranteed in the case of
failure. I am also not sure whether partitions are ordered. Can you get
the same sequence of partitions in mapPartitions?

On 13 Sep 2017 19:54, "Ankit Maloo" <ankitmaloo1...@gmail.com> wrote:
>
> Rdd are fault tolerant as it can be recomputed using DAG without storing the 
> intermediate RDDs.
>
> On 13-Sep-2017 11:16 PM, "Suzen, Mehmet" <su...@acm.org> wrote:
>>
>> But what happens if one of the partitions fail, how fault tolerance recover 
>> elements in other partitions.




Re: [Rd] establishing a Code of Conduct for R

2017-09-13 Thread Suzen, Mehmet
On 13 September 2017 at 13:22, Brian G. Peterson  wrote:
> I am not an official representative of the R team, so this is only my
> opinion.
>

Thank you.

> It seems to me that you are trying to create a solution to a problem
> which does not exist.

I am not trying to create any solution, and it is not my decision whether
this is really needed for the R project. It was just a naive suggestion, cf.
https://opensource.guide/code-of-conduct/#why-do-i-need-a-code-of-conduct
https://www.contributor-covenant.org/

Best
Mehmet

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: RDD order preservation through transformations

2017-09-13 Thread Suzen, Mehmet
But what happens if one of the partitions fails? How does fault tolerance
recover the elements in the other partitions?

On 13 Sep 2017 18:39, "Ankit Maloo" <ankitmaloo1...@gmail.com> wrote:

> AFAIK, the order of a rdd is maintained across a partition for Map
> operations. There is no way a map operation  can change sequence across a
> partition as partition is local and computation happens one record at a
> time.
>
> On 13-Sep-2017 9:54 PM, "Suzen, Mehmet" <su...@acm.org> wrote:
>
> I think the order has no meaning in RDDs see this post, specially zip
> methods:
> https://stackoverflow.com/questions/29268210/mind-blown-rdd-zip-method
>
>
>
>


Re: RDD order preservation through transformations

2017-09-13 Thread Suzen, Mehmet
I think order has no meaning in RDDs; see this post, especially regarding the zip methods:
https://stackoverflow.com/questions/29268210/mind-blown-rdd-zip-method




Re: [R] (no subject)

2017-09-13 Thread Suzen, Mehmet
Hello David,

As the error message says, a version dependency is not satisfied:
"error: Need GSL version >= 1.12". If you are using Ubuntu, for example,
you could run:
sudo apt-get install libgsl2

Or you can compile it yourself; I am sure there are people at LRZ who can
help you with this :)

Best,
Mehmet

On 13 September 2017 at 11:23, Brayford, David  wrote:
> When I try to install gsl in R I get the error Need GSL version >= 1.12 . 
> However, I have version 2.3 of gsl installed on the system, which is picked 
> up earlier in the configure process (see below). Is it possible for someone 
> to fix this error in the configure script?
>
> checking for gsl-config... /lrz/sys/libraries/gsl/2.3/bin/gsl-config
> checking if GSL version >= 1.12... checking for gcc... gcc
> checking for C compiler default output file name... a.out
> checking whether the C compiler works... yes
> checking whether we are cross compiling... no
> checking for suffix of executables...
> checking for suffix of object files... o
> checking whether we are using the GNU C compiler... yes
> checking whether gcc accepts -g... yes
> checking for gcc option to accept ISO C89... none needed
> configure: error: Need GSL version >= 1.12
> ERROR: configuration failed for package ‘gsl’
>
>
> David
>
>
> [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] comparition of occurrence of multiple variables between two dataframes

2017-09-12 Thread Suzen, Mehmet
> [...]
>       .... .... .... .... .... .... .... .... .... .... NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
>  1291    0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0   1192
>  1083    0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    424
>  3919    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    380
> 14685    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    370
>  4021    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    358
>  5452    0    0    0    0    0    0    1    0    0   11    0    0    0    0    1    0    0    356
>
> The columns are the possible structures found on a tree (cavity, scar…)
>
> And the same for data0:
>
>       CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
>  4728    0    0    0    1    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0
>  5339    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
> 11766    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
>   796    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
>  3561    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
> 10581    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0
>
>       IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
>  4728    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
>  5339    1    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
> 11766    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0    0    0    0    0    0    0    0    0
>   796    1

Re: [R] comparition of occurrence of multiple variables between two dataframes

2017-09-12 Thread Suzen, Mehmet
Do you have a simplified example with code? It is not clear to me
what you mean by tree, but if you are referring to a tree data structure,
maybe you could change the data structure to a tree
(https://cran.r-project.org/web/packages/data.tree/vignettes/data.tree.html)
and try to write a comparison of two tree objects. It might be easier
than using a data.frame alone.
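
For the question quoted below, here is a minimal sketch of one possible
test, assuming the presence/absence of each structure is compared between
the two groups with Fisher's exact test and the p-values are then adjusted
for multiple testing. The toy data, the helper name `compare_structures`,
the weight used for GR11 and the cutoff of 20 are all hypothetical; only
the weights 15 (CV11) and 4 (IN12) come from the question.

```r
# Hypothetical toy data: one row per tree, one column per structure count.
trees <- data.frame(
  CV11 = rep(c(0, 0, 1, 3), times = 10),
  IN12 = rep(c(0, 1, 0, 1), times = 10),
  GR11 = rep(c(1, 0), times = 20)
)
# Final value of each tree; the weight 2 for GR11 is an assumption.
trees$ecoval <- 15 * trees$CV11 + 4 * trees$IN12 + 2 * trees$GR11

# For every structure, test whether its presence differs between trees
# above and below the cutoff (a 2 x 2 table per structure).
compare_structures <- function(data, value_col = "ecoval", cutoff = 100) {
  high <- factor(data[[value_col]] > cutoff, levels = c(FALSE, TRUE))
  structures <- setdiff(names(data), value_col)
  p <- vapply(structures, function(s) {
    present <- factor(data[[s]] > 0, levels = c(FALSE, TRUE))
    fisher.test(table(present, high))$p.value
  }, numeric(1))
  data.frame(
    structure  = structures,
    p_value    = p,
    p_adjusted = p.adjust(p, method = "holm"),  # multiple-testing correction
    row.names  = NULL
  )
}

res <- compare_structures(trees, cutoff = 20)
print(res)
```

With the real data one would call it with `cutoff = 100`; a count-based
test such as `wilcox.test` per structure would be an alternative if the
number of occurrences matters rather than mere presence.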

On 12 September 2017 at 12:27, Céline Lüscher  wrote:
> Hi everyone, I need your help to solve a problem with occurrence and two 
> dataframes.
> I have an excel table of 15200 lines. Each line correspond to a tree analyzed 
> for its structures. I have all the structures in columns (48 structures). The 
> occurrence of these structures has been counted on every tree. For example, 
> the tree 12607 has 3 structures CV11, 1 structure IN12 and none (0) of the 
> rest of all the other structures. The very last column is the value given to 
> the tree, according to the structures found on it (each structure giving a 
> number of point to the tree by its presence on it).
> The question is: Are there some structures, or combination of structures, 
> which give a high value to the tree ? Of course, according to the value of 
> each structure, we can see which one has a higher value than the others (ex: 
> structure CV11 has a value of 15, structure IN12 has a value of 4). But what 
> I want to know is, if we take all the trees having a final value higher than 
> 100 (we create a new dataframe "data100"), and we compare with the trees 
> having a final value under 100 (we create another dataframe "data0"), can we 
> find a significant difference in the number and occurrence of structures 
> found on these trees? And which structure is related to trees with a higher 
> value than 100 ?
> For now, I have only a visual answer to the question. I did two boxplot of 
> the data100 and data0, and I have seen some différences : 2 structures are 
> only found in the data100, which can be caracteristic of a final value higher 
> than 100. The problem is that I’m looking for a test to prove this.
> If you have any idea or proposition for solving this problem.. it will be 
> great!
> Best wishes,
> C.
>

Re: [Rd] Suggestion: Create On-Disk Dataframes

2017-09-04 Thread Suzen, Mehmet
It is not needed. There is a large community of developers using SparkR:
https://spark.apache.org/docs/latest/sparkr.html
It does exactly what you want.

On 3 September 2017 at 20:38, Juan Telleria  wrote:
> Dear R Developers,
>
> I would like to suggest the creation of a new S4 object class for On-Disk
> data.frames which do not fit in RAM memory, which could be called
> disk.data.frame()
>
> It could be based in rsqlite for example (By translating R syntax to SQL
> syntax for example), and the syntax and way of working of the
> disk.data.frame() class could be exactly the same than with data.frame
> objects.
>
> When the session is of, is such disk.data.frames are not saved, and
> implicit DROP TABLE could be done in all the schemas created in rsqlite.
>
> Nowadays, with the SSD disk drives such new data.frame() class could have
> sense, specially when dealing with Big Data.
>
> It is true that this new class might be slower than regular data.frame,
> data.table or tibble classes, but we would be able to handle much more
> data, even if it is at cost of speed.
>
> Also with data sampling, and use of a regular odbc connection we could do
> all the work, but for people who do not know how to use RDBMS or specific
> purpose R packages for this job, this could work.
>
> Another option would be to base this new S4 class  on feather files, but
> maybe making it with rsqlite is simply easier.
>
> A GitHub project could be created for such purpose, so that all the
> community can contribute (included me :D ).
>
> Thank you,
> Juan
>


Re: [Rd] readLines() segfaults on large file & question on how to work around

2017-09-02 Thread Suzen, Mehmet
Jennifer, why don't you try SparkR?

https://spark.apache.org/docs/1.6.1/api/R/read.json.html
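
Separately, for the incremental-reading idea in the quoted message, here
is a minimal base R sketch that reads a file in fixed-size raw chunks, so
that no single huge string is ever created. The names `read_in_chunks`
and `handle` are hypothetical, and the sketch does not parse JSON; it
only shows the chunked-reading loop.

```r
# Read a file in raw chunks and pass each chunk to a callback.
read_in_chunks <- function(path, chunk_bytes = 1024 * 1024, handle) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_bytes)
    if (length(chunk) == 0) break  # end of file
    handle(chunk)
  }
  invisible(NULL)
}

# Usage on a small single-line file, as in the reproduction recipe below.
tmp <- tempfile()
writeLines(paste0(c(letters, 0:9), collapse = ""), tmp, sep = "")
total_bytes <- 0
read_in_chunks(tmp, chunk_bytes = 10,
               handle = function(chunk) total_bytes <<- total_bytes + length(chunk))
print(total_bytes)
```

A real solution would feed the chunks to a streaming parser; that part is
left out here.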

On 2 September 2017 at 23:15, Jennifer Lyon  wrote:
> Thank you for your suggestion. Unfortunately, while R doesn't segfault
> calling readr::read_file() on the test file I described, I get the error
> message:
>
> Error in read_file_(ds, locale) : negative length vectors are not allowed
>
> Jen
>
> On Sat, Sep 2, 2017 at 1:38 PM, Ista Zahn  wrote:
>
>> As s work-around I  suggest readr::read_file.
>>
>> --Ista
>>
>>
>> On Sep 2, 2017 2:58 PM, "Jennifer Lyon"  wrote:
>>
>>> Hi:
>>>
>>> I have a 2.1GB JSON file. Typically I use readLines() and
>>> jsonlite:fromJSON() to extract data from a JSON file.
>>>
>>> When I try and read in this file using readLines() R segfaults.
>>>
>>> I believe the two salient issues with this file are
>>> 1). Its size
>>> 2). It is a single line (no line breaks)
>>>
>>> I can reproduce this issue as follows
>>> #Generate a big file with no line breaks
>>> # In R
>>> > writeLines(paste0(c(letters, 0:9), collapse=""), "alpha.txt", sep="")
>>>
>>> # in unix shell
>>> cp alpha.txt file.txt
>>> for i in {1..26}; do cat file.txt file.txt > file2.txt && mv -f file2.txt
>>> file.txt; done
>>>
>>> This generates a 2.3GB file with no line breaks
>>>
>>> in R:
>>> > moo <- readLines("file.txt")
>>>
>>>  *** caught segfault ***
>>> address 0x7cff, cause 'memory not mapped'
>>>
>>> Traceback:
>>>  1: readLines("file.txt")
>>>
>>> Possible actions:
>>> 1: abort (with core dump, if enabled)
>>> 2: normal R exit
>>> 3: exit R without saving workspace
>>> 4: exit R saving workspace
>>> Selection: 3
>>>
>>> I conclude:
>>>  I am potentially running up against a limit in R, which should give a
>>> reasonable error, but currently just segfaults.
>>>
>>> My question:
>>> Most of the content of the JSON is an approximately 100K x 6K JSON
>>> equivalent of a dataframe, and I know R can handle much bigger than this
>>> size. I am expecting these JSON files to get even larger. My R code lives
>>> in a bigger system, and the JSON comes in via stdin, so I have absolutely
>>> no control over the data format. I can imagine trying to incrementally
>>> parse the JSON so I don't bump up against the limit, but I am eager for
>>> suggestions of simpler solutions.
>>>
>>> Also, I apologize for the timing of this bug report, as I know folks are
>>> working to get out the next release of R, but like so many things I have
>>> no
>>> control over when bugs leap up.
>>>
>>> Thanks.
>>>
>>> Jen
>>>
>>> > sessionInfo()
>>> R version 3.4.1 (2017-06-30)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>> Running under: Ubuntu 14.04.5 LTS
>>>
>>> Matrix products: default
>>> BLAS: R-3.4.1/lib/libRblas.so
>>> LAPACK:R-3.4.1/lib/libRlapack.so
>>>
>>> locale:
>>>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>>>  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
>>>  [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
>>>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
>>>  [9] LC_ADDRESS=C   LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics  grDevices utils datasets  methods   base
>>>
>>> loaded via a namespace (and not attached):
>>> [1] compiler_3.4.1
>>>


Re: [R] Block comment?

2017-09-02 Thread Suzen, Mehmet
AFAIK a block comment is not possible; it would need to be implemented in
the R interpreter and defined in the parser. The 'if' solution is not
elegant.
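
A minimal sketch of why the 'if' workaround is only a partial substitute,
assuming nothing beyond base R: the body of `if (FALSE)` is never
evaluated, but it still has to be syntactically valid R, unlike a real
comment.

```r
x <- 1

if (FALSE) {
  # This assignment is skipped at run time, but the parser still reads it,
  # so free text or unbalanced brackets here would be a syntax error.
  x <- 2
}

print(x)
```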

On 2 September 2017 at 14:09, Uwe Ligges
 wrote:
>
>
> On 02.09.2017 11:40, Christian wrote:
>>
>> I consider it quite worth while to introduce into R syntax a nestable
>> block comment like
>>
>> #{
>> 
>> }#
>
>
> if(FALSE){
> 
> }
>
> Best,
> Uwe Ligges
>
>
>> It would make documentation more easily manageable and lucid.
>> Is there considerable need for this.
>>
>> Please, comment on this.
>> How about R core?
>>
>> Christian
>
>


Re: Training A ML Model on a Huge Dataframe

2017-08-23 Thread Suzen, Mehmet
SGD is supported. I see, I had assumed you were using Scala. It looks like
you can do streaming regression, though I am not sure about the PySpark API:

https://spark.apache.org/docs/latest/mllib-linear-methods.html#streaming-linear-regression

On 23 August 2017 at 18:22, Sea aj <saj3...@gmail.com> wrote:

> Thanks for the reply.
>
> As far as I understood mini batch is not yet supported in ML libarary. As
> for MLLib minibatch, I could not find any pyspark api.
>
>
>
>
> On Wed, Aug 23, 2017 at 2:59 PM, Suzen, Mehmet <su...@acm.org> wrote:
>
>> It depends on what model you would like to train but models requiring
>> optimisation could use SGD with mini batches. See:
>> https://spark.apache.org/docs/latest/mllib-optimization.html
>> #stochastic-gradient-descent-sgd
>>
>> On 23 August 2017 at 14:27, Sea aj <saj3...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to feed a huge dataframe to a ml algorithm in Spark but it
>>> crashes due to the shortage of memory.
>>>
>>> Is there a way to train the model on a subset of the data in multiple
>>> steps?
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>>
>> Mehmet Süzen, MSc, PhD
>> <su...@acm.org>
>>
>>
>
>


-- 

Mehmet Süzen, MSc, PhD
<su...@acm.org>



Re: Training A ML Model on a Huge Dataframe

2017-08-23 Thread Suzen, Mehmet
It depends on what model you would like to train but models requiring
optimisation could use SGD with mini batches. See:
https://spark.apache.org/docs/latest/mllib-optimization.html#stochastic-gradient-descent-sgd

On 23 August 2017 at 14:27, Sea aj  wrote:

> Hi,
>
> I am trying to feed a huge dataframe to a ml algorithm in Spark but it
> crashes due to the shortage of memory.
>
> Is there a way to train the model on a subset of the data in multiple
> steps?
>
> Thanks
>
>
>
>



-- 

Mehmet Süzen, MSc, PhD




Re: [R] Directional Forecast

2017-08-11 Thread Suzen, Mehmet
I suggest you read "Forecasting: Principles and Practice" by Hyndman and
Athanasopoulos:
https://www.otexts.org/fpp


Re: [R] Nested cross validation with lapply

2017-08-08 Thread Suzen, Mehmet
Hi Jesús,

Do you have code that you tried without lapply? Why don't you post that here too?

There are a couple of packages supporting nested CV, such as TANDEM and
blkbox; you may want to check their code.

Also, the `cvTools` package may help you to write one.
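
A minimal sketch of nested cross-validation written with lapply/sapply
only, assuming a simulated data set and a polynomial regression whose
degree is the tuning parameter. The helper names (`make_folds`, `mse`,
`nested_cv`) and all settings are hypothetical, not taken from any of
the packages above.

```r
set.seed(42)
n <- 120
dat <- data.frame(x = runif(n, -2, 2))
dat$y <- sin(dat$x) + rnorm(n, sd = 0.2)

# Randomly assign row indices to k folds.
make_folds <- function(n, k) {
  split(sample(seq_len(n)), rep(seq_len(k), length.out = n))
}

# Fit on 'train' with a given polynomial degree, return MSE on 'test'.
mse <- function(train, test, degree) {
  fit <- lm(y ~ poly(x, degree), data = train)
  mean((test$y - predict(fit, newdata = test))^2)
}

nested_cv <- function(data, degrees = 1:5, k_outer = 5, k_inner = 4) {
  outer_folds <- make_folds(nrow(data), k_outer)
  lapply(outer_folds, function(test_idx) {
    outer_train <- data[-test_idx, ]
    outer_test  <- data[test_idx, ]
    # Inner loop: choose the degree using the outer training part only.
    inner_folds <- make_folds(nrow(outer_train), k_inner)
    inner_err <- sapply(degrees, function(d) {
      mean(sapply(inner_folds, function(val_idx) {
        mse(outer_train[-val_idx, ], outer_train[val_idx, ], d)
      }))
    })
    best <- degrees[which.min(inner_err)]
    # Outer loop: estimate the error of the tuned model on held-out data.
    list(best_degree = best, test_mse = mse(outer_train, outer_test, best))
  })
}

res <- nested_cv(dat)
print(sapply(res, function(r) r$test_mse))
```

The point of the nesting is that the outer test fold is never seen while
the degree is being selected.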

On 7 August 2017 at 15:21, Jesús Para Fernández
 wrote:
> Hi all!!
>
> How can i do nested cross validation with lapply??
>
> I know caret package, but I want to do it manuallly using lapply instead for 
> bucle.
>
> Thanks!!
> Jesús
>

Re: How can i remove the need for calling cache

2017-08-02 Thread Suzen, Mehmet
On 3 August 2017 at 03:00, Vadim Semenov  wrote:
> `saveAsObjectFile` doesn't save the DAG, it acts as a typical action, so it
> just saves data to some destination.

Yes, that's what I thought, so the statement "..otherwise saving it on
a file will require recomputation."  from the book is not entirely
true.




Re: How can i remove the need for calling cache

2017-08-02 Thread Suzen, Mehmet
On 3 August 2017 at 01:05, jeff saremi  wrote:
> Vadim:
>
> This is from the Mastering Spark book:
>
> "It is strongly recommended that a checkpointed RDD is persisted in memory,
> otherwise saving it on a file will require recomputation."

Is this really true? I had the impression that the DAG is not carried
along once an RDD is serialized to an external file. Or does
'saveAsObjectFile' save the DAG as well?




Re: [R] How export data set (available in the package) from R?

2017-07-30 Thread Suzen, Mehmet
I also suggest Hadley's optimized package for working with xls files
in R:

https://github.com/tidyverse/readxl
https://cran.r-project.org/web/packages/readxl/index.html



Re: [Rd] matrices with names

2017-07-27 Thread Suzen, Mehmet
Not always, see what happens with lapply:
> x<-matrix(12,1,1)
> names(x)<-"one"
> y<-matrix(1,1,1)
> names(y)<-"one"
> dput(lapply(x,`+`,e2=y))
structure(list(one = structure(13, .Dim = c(1L, 1L))), .Names = "one")
> dput(lapply(x,`+`,e2=1))
structure(list(one = 13), .Names = "one")


Prof. Ripley pointed this out some time ago:
https://stat.ethz.ch/pipermail/r-devel/2011-October/062297.html

Note that ?`+` says:

 The rules for determining the attributes of the result are rather
 complicated.  Most attributes are taken from the longer argument.
 Names will be copied from the first if it is the same length as
 the answer, otherwise from the second if that is.  If the
 arguments are the same length, attributes will be copied from
 both, with those of the first argument taking precedence when the
 same attribute is present in both arguments. For time series,
 these operations are allowed only if the series are compatible,
 when the class and ‘tsp’ attribute of whichever is a time series
 (the same, if both are) are used.  For arrays (and an array
 result) the dimensions and dimnames are taken from first argument
 if it is an array, otherwise the second.


Re: A tool to generate simulation data

2017-07-27 Thread Suzen, Mehmet
I suggest the RandomRDDs API. It provides nice tools, and writing
wrappers around it might be a good approach.

https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.random.RandomRDDs$




Re: [R] R package for scorecard development

2017-06-29 Thread Suzen, Mehmet
I suggest you have a look at this R document:
https://cran.r-project.org/doc/contrib/Sharma-CreditScoring.pdf

On 28 June 2017 at 13:26, Nikhil Abhyankar  wrote:
> Hello all,
>
> Is there any R package that can develop a scorecard model for a binary
> target variable?
>
> More details:
> I want to create a scorecard based on the raw data I have.
>
> I have a binary target variable and a few numeric and character input
> variables.
>
> I want to bin the variables and assign a score to each of the bins.
>
> Each subject will be scored based on the bin it falls in for each variable.
>
> All such scores from each of the variables will be added up to get the
> final score.
>
> There will be a cutoff score to decide which of the two classes of response
> the subject falls into.
>
> I fount and tested the smbinning package. However, it only gives the bins
> for a single variable at a time.
>
> How can I get a full scorecard model?
>
> Thanks
> Nikhil
>


Re: [R] Nash equilibrium and other game theory tools implemented in networks using igraph or similar

2017-06-28 Thread Suzen, Mehmet
Hello Chris,

I was implying that you are capable enough to implement it, since you have
already identified a research paper. If there is no package out there,
uploading yours to CRAN would help future users too. I am more than happy
to help if you want to implement it from scratch.

Best,
Mehmet

On 27 June 2017 at 17:45, Chris Buddenhagen  wrote:
> Does anyone know of some code, and examples that implement game theory/Nash
> equilibrium hypothesis testing using existing packages like igraph/statnet
> or similar?
>
> Perhaps along the lines of this article:
>
> Zhang, Y., Aziz-Alaoui, M. A., Bertelle, C., & Guan, J. (2014). Local Nash
> Equilibrium in Social Networks, *4*, 6224.
>
> Best,
> Chris Buddenhagen
> cbuddenha...@gmail.com
>


Re: [R] Nash equilibrium and other game theory tools implemented in networks using igraph or similar

2017-06-27 Thread Suzen, Mehmet
Why don't you implement it and upload the package to CRAN?

On 27 Jun 2017 17:45, "Chris Buddenhagen"  wrote:

Does anyone know of some code, and examples that implement game theory/Nash
equilibrium hypothesis testing using existing packages like igraph/statnet
or similar?

Perhaps along the lines of this article:

Zhang, Y., Aziz-Alaoui, M. A., Bertelle, C., & Guan, J. (2014). Local Nash
Equilibrium in Social Networks, *4*, 6224.

Best,
Chris Buddenhagen
cbuddenha...@gmail.com



Re: Do we anything for Deep Learning in Spark?

2017-06-21 Thread Suzen, Mehmet
There is a BigDL project:
https://github.com/intel-analytics/BigDL

On 20 June 2017 at 16:17, Jules Damji  wrote:
> And we will having a webinar on July 27 going into some more  details. Stay
> tuned.
>
> Cheers
> Jules
>
> Sent from my iPhone
> Pardon the dumb thumb typos :)
>
> On Jun 20, 2017, at 7:00 AM, Michael Mior  wrote:
>
> It's still in the early stages, but check out Deep Learning Pipelines from
> Databricks
>
> https://github.com/databricks/spark-deep-learning
>
> --
> Michael Mior
> mm...@apache.org
>
> 2017-06-20 0:36 GMT-04:00 Gaurav1809 :
>>
>> Hi All,
>>
>> Similar to how we have machine learning library called ML, do we have
>> anything for deep learning?
>> If yes, please share the details. If not then what should be the approach?
>>
>> Thanks and regards,
>> Gaurav Pandya
>>
>>
>>



Re: [R] Latin Hypercube Sampling when parameters are defined according to specific probability distributions

2017-06-01 Thread Suzen, Mehmet
No, it is an R programming question.  Nelly specifically asked you:

"how can I use your code to apply my model to each of the 50 rows of
the data frame “tabLHS”?"


Re: [R] (Somewhat?) Off topic: Containerization software

2017-06-01 Thread Suzen, Mehmet
This is a nice summary addressing the same with R:
https://arxiv.org/pdf/1410.0846.pdf

On 30 May 2017 at 17:43, Bert Gunter  wrote:
> Folks:
>
> This is **off topic**, but I thought it might be informative to this
> community. Consequently: please **no on list public comments or
> discussion**. Feel free to respond to me privately, if you like; but I
> have neither knowledge nor opinions, so why bother? This is just FYI.
> My apology if it is deemed inappropriate.
>
> http://www.nature.com/news/software-simplified-1.22059
>
> Cheers,
> Bert
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>


Re: [Rd] as.POSIXct behaviour

2017-03-08 Thread Suzen, Mehmet
On 9 March 2017 at 01:29, Arunkumar Srinivasan
 wrote:
> The time info is lost on the first index as well. And it happens *silently*.

Yes, because it assumes a homogeneous format across the entire vector. You
may want to do two passes with different formats, or run a regular
expression to catch typos.

> lapply(x, as.POSIXct)
> A list is returned, but values are as I’d expect. Would it be possible
> to retain the time info with as.POSIXct(x) directly as well? If not,

This behaves differently because you are running as.POSIXct for each element
separately.
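
The two-pass idea can be sketched as follows. This is only an illustration: the input vector and the two format strings are made up here, not taken from the original report, and the time zone is fixed to UTC to keep the result reproducible.

```r
# Hypothetical input: most entries carry a time, one is date-only.
x <- c("2017-03-08 01:29:00", "2017-03-09", "2017-03-10 14:05:30")

# Pass 1: parse with the full date-time format; entries that do not
# match this format come back as NA.
parsed <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Pass 2: re-parse only the failures with the date-only format.
failed <- is.na(parsed)
parsed[failed] <- as.POSIXct(x[failed], format = "%Y-%m-%d", tz = "UTC")

# Anything still NA after both passes is likely a typo worth inspecting.
print(format(parsed, "%Y-%m-%d %H:%M:%S"))
print(x[is.na(parsed)])
```

With explicit formats, each element keeps its time of day when it has one, and the date-only entries fall back to midnight.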


partition size inherited from parent: auto coalesce

2017-01-16 Thread Suzen, Mehmet
Hello List,

 I was wondering what the design principle is behind an RDD inheriting
its number of partitions from its parent. See the simple example below
[*]. 'ngauss_rdd2' has significantly less data; intuitively, in such
cases, shouldn't Spark invoke coalesce automatically for performance?
What would be the configuration option for this, if there is any?

Best,
-m

[*]
// Generate 1 million Gaussian random numbers
import util.Random
Random.setSeed(4242)
val ngauss = (1 to 1e6.toInt).map(x=>Random.nextGaussian)
val ngauss_rdd = sc.parallelize(ngauss)
ngauss_rdd.count // 1 million
ngauss_rdd.partitions.size // 4
val ngauss_rdd2 = ngauss_rdd.filter(x=>x > 4.0)
ngauss_rdd2.count // 35
ngauss_rdd2.partitions.size // 4




Re: [R] Social Network Simulation

2016-04-18 Thread Suzen, Mehmet
Dear Professor Haenlein,
Have you solved this issue yet? I found this a really interesting problem.
I was wondering if it is possible to wrap an "objective function"
around igraph's 'sample_pa' and 'sample_smallworld'. If you have an
example data set, I can have a look at this.
Viele Gruesse aus London
Mehmet

On 16 April 2016 at 14:16, Michael Haenlein  wrote:
> Dear all,
>
> I am trying to simulate a series of networks that have characteristics
> similar to real life social networks. Specifically I am interested in
> networks that have (a) a reasonable degree of clustering (as measured by
> the transitivity function in igraph) and (b) a reasonable degree of degree
> polarization (as measured by the average degree of the top 10% nodes with
> highest degree divided by the overall average degree).
>
> Right now I am using two functions from igraph (sample_pa and
> sample_smallworld) but these are not ideal since they only allow me to vary
> one of the two characteristics. Either the network has good clustering but
> not enough polarization or the other way round.
>
> I looked around and I found some network algorithms that solve the problem
> (E.g., Jackson and Rogers, Meeting Strangers and Friends of Friends), but I
> did not find them implemented in an R package. I also found the R package
> NetSim which seems to be in this spirit, but I cannot get it to work.
>
> Could anyone point me to an R library that I could check out? I do not care
> much about the specific algorithm used as long as it allows me to vary
> clustering and degree polarization in certain ranges.
>
> Thanks,
>
> Michael
>
>
> Michael Haenlein
> Professor of Marketing
> ESCP Europe, Paris
>


Re: [Rd] Inconsistency when naming a vector

2015-04-27 Thread Suzen, Mehmet
There is no inconsistency. The documentation of `names` says "...value
should be a character vector of up to the same length as x...".
In the first definition your character vector is shorter than x, so
the missing entry becomes NA, just as indexing the undefined value[2]
returns NA:

x <- 1:2
value <- c("a")
value[2]
[1] NA

whereas in the second case R uses the default value "". From the `names`
documentation: "The name "" is special: it is used to indicate that
there is no name associated with an element...". Since you defined the
first one, it internally assigns "" to the undefined names to match the
length of the vector.
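
A short sketch of the two behaviours described above. The vectors are made up for illustration, and the second case (naming only some elements in `c()`) is just one assumed way of ending up with "" names; it may not be the exact call from the original report.

```r
# Case 1: assigning a names vector shorter than x pads the rest with NA.
x <- 1:2
names(x) <- c("a")
print(names(x))

# Indexing past the end of a character vector gives NA as well.
value <- c("a")
print(is.na(value[2]))

# Case 2: naming only some elements at construction time; the unnamed
# element gets the special empty name "".
y <- c(a = 1L, 2L)
print(names(y))
```

So a missing name can show up either as NA or as "", depending on how the names were set.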



Re: [gmx-users] Using thermostats to create temperature gradient in system

2015-04-26 Thread Suzen, Mehmet
Hi Agnivo,

Temperature gradient means non-equilibrium MD (NEMD)

See notes from Prof. Martini:
https://nanohub.org/resources/7582/download/Martini_L10_NonequilibruimMD.pdf

What observable would you like to measure? Let's say you want to
measure observable A.

One procedure I can think of:

1.  Run the whole system at T=300K in NVT. After equilibration,
continue running the whole system at T=400K with the solvent group frozen.
To see how to freeze a group, see
http://manual.gromacs.org/online/mdp_opt.html#neq

To get good statistics, you need to repeat this procedure K times, as
the solvent thermal wall configuration will be different in each
simulation.

100ps? Where did you get that? You need to look at how the time average
of observable A behaves to judge whether 100ps is enough, and at how A
behaves across the K different runs.


2. Similar to the previous procedure, but this time, after freezing the
solvent group, run the system in NVE. Again, you need to repeat this
many times to get good statistics. Similarly, monitor observable A to
see if 500ps is enough.

However, I suggest you talk to someone with NEMD experience, as
this procedure is just a heuristic approach.

-m


Re: [gmx-users] Using thermostats to create temperature gradient in system

2015-04-26 Thread Suzen, Mehmet
Hi Agnivo;

Yes, the idea of freezing the solvent or the protein many times is
to sample the non-equilibrium thermal process. It will remain at the
target temperature on average, over many fixed configurations obtained
from different NVT runs (equilibrium runs). But you may need to repeat
this K times, with K large, to get good statistics, as I said. Concerning
NVE, yes, do the same fixing approach with the protein. But as I mentioned
earlier, this is quite a heuristic approach; I suggest you ask a
faculty member with a background in non-equilibrium processes whether
this approach mimics what you need.

Best,
-m


Re: [R] lm() funtion

2015-04-24 Thread Suzen, Mehmet
Try lm.ridge from the MASS package.



Re: [R] Multinomial Fitting Distrbution

2015-04-22 Thread Suzen, Mehmet
The mixtools package can fit mixtures of Gaussians; maybe that would help?



Re: [R] Random Forest in Caret

2015-04-22 Thread Suzen, Mehmet
Can you post your memory profile and code?



Re: [R] Cost-effectiveness Analysis using R

2015-04-14 Thread Suzen, Mehmet
Yes.



Re: [R] any way to write sas7bdat with R

2015-04-14 Thread Suzen, Mehmet
I haven't tried this, but there is an experimental package from Dr. Shotwell:
http://cran.r-project.org/web/packages/sas7bdat/index.html
Since it can read the format, maybe you can modify it to write as well?



Re: [R] Cost-effectiveness Analysis in R

2015-04-14 Thread Suzen, Mehmet
Do you have a specific example that you have tried to implement in R?
Can you post your code too?

There are high-quality packages, BCEA and BayesTree, that could be helpful:
http://cran.r-project.org/web/packages/BCEA/index.html
http://cran.r-project.org/web/packages/BayesTree/index.html



Re: [Numpy-discussion] IDE's for numpy development?

2015-04-08 Thread Suzen, Mehmet
 Spyder supports C.

Thanks for correcting this. I wasn't aware of it.
How was your experience with it?

Best,
-m


Re: [Numpy-discussion] IDE's for numpy development?

2015-04-06 Thread Suzen, Mehmet
Hi Chuck,

Spyder is good if you are coming from the Matlab world.

http://spyder-ide.blogspot.co.uk/

I don't think it supports C, though. Maybe you are after Eclipse.

Best,
-m


Re: [R] glmnet: converting coefficients back to original scale

2015-04-03 Thread Suzen, Mehmet
This is interesting, can you post your lm.ridge solution as well?  I
suspect in glmnet, you need to use model.matrix with intercept, that
could be the reason.

-m



Re: [R] weighted network centrality measures by network size

2014-08-07 Thread Suzen, Mehmet
Hi Jenny,

Have you tried igraph before?  See http://igraph.org/r/doc/
There are a couple of centrality measures there.

Best,
-m



On 6 August 2014 02:50, Jenny Jiang jiangyun...@y7mail.com wrote:
 Dear R-help,

 My name is Jenny Jiang and I am a Finance Honours research
  student from the University of New South Wales Australia. Currently my
 research project involves the calculating of some network centrality
 measures in R, which are degree, closeness, betweenness and eigenvector. 
 However I am having some issue regarding to the calculation of
 the weighted centrality measures by network size. For example, currently
  my code allows me to calculate centrality measures for each firm year,
 and now I would like to calculate centrality measures weighted by the
 firm network size for each firm year.

 My current code is like the following:

 install.packages("statnet")

 library(statnet)

 #read csv
 data <- read.csv("D:\\Users\\z3377013\\Desktop\\networknew1.csv",header=TRUE)
 #companies <- unique(data$CompanyID_)
 #years <- unique(data$Year)
 pairs <- unique(data[,c(1,3)])
 #directors <- unique(c(data$DirectorID_,data$DirectorID_Connected))
 #director_map <- 1:length(directors)
 #names(director_map) <- c(as.character(directors))

 #for (i in 1:nrow(data)) {
 #  data[i,2] = director_map[as.character(data[i,2])]
 #  data[i,4] = director_map[as.character(data[i,4])]
 #}

 sink("D:\\Users\\z3377013\\Desktop\\measure1.csv")
 for (i in 1:nrow(pairs)) {
   d <- subset(data, CompanyID_==pairs[i,1] & Year==pairs[i,2])
   directors <- unique(c(d$DirectorID_,d$DirectorID_Connected))
   director_map <- 1:length(directors)
   names(director_map) <- c(as.character(directors))
   for (j in 1:nrow(d)) {
     d[j,2] = director_map[as.character(d[j,2])]
     d[j,4] = director_map[as.character(d[j,4])]
   }

   net <- network(d[,c(2,4)],directed=F,loops=F,matrix.type="edgelist")

   degree <- degree(net, cmode="freeman", gmode="graph")
   closeness <- closeness(net,gmode="graph",cmode="undirected")
   betweenness <- betweenness(net,gmode="graph",cmode="undirected")
   evcent <- evcent(net,gmode="graph",use.eigen=TRUE)

   write.csv(cbind(pairs[i,], directors, degree, closeness, betweenness, evcent), row.names=FALSE)
 }
 sink()

 And an example of my data structure is like the following:

 CompanyID_  DirectorID_  Year  DirectorID_Connected
 900         3700068021   2003  3699838021
 900         3700418032   2003  3699838021
 900         3700598032   2003  3699838021
 900         3700898032   2003  3699838021
 900         3703478063   2003  3699838021
 900         3703628063   2003  3699838021
 900         3703838063   2003  3699838021
 900         3703998063   2003  3699838021
 900         3699838021   2003  3700068021
 900         3700418032   2003  3700068021
 900         3700598032   2003  3700068021
 900         3700898032   2003  3700068021
 900         3703478063   2003  3700068021
 900         3703628063   2003  3700068021
 900         3703838063   2003  3700068021
 900         3703998063   2003  3700068021
 900         3699838021   2003  3700418032
 900         3700068021   2003  3700418032
 900         3700598032   2003  3700418032
 900         3700068021   2004  3699838021
 900         3700418032   2004  3699838021
 900         3700598032   2004  3699838021
 900         3700898032   2004  3699838021
 900         3703478063   2004  3699838021
 1290        1604538114   2003  427207466
 1290        3556906472   2003  427207466
 1290        3701108032   2003  427207466
 1290        3708458104   2003  427207466
 1290        3708478104   2003  427207466
 1290        3711248135   2003  427207466
 1290        10167110612  2003  427207466
 1290        10271811383  2003  427207466

 where for each firm-year I have a list of directors and their corresponding 
 connected directors within that firm-year.

 If you could
 provide me the R code regarding to how to calculate the weighted measures by 
 network size that that would be really
 helpful.

 I cannot be more than appreciated.

 Best regards

 Jenny



Re: [R] Using R to analyze multiple MRI studies

2014-07-05 Thread Suzen, Mehmet
Did you look at the CRAN task view for medical imaging?
http://cran.r-project.org/web/views/MedicalImaging.html

On 3 July 2014 17:09, moleps islon mole...@gmail.com wrote:
 I need to analyze multiple T1 contrast enhanced MRI studies from different
 patients. They are all in DICOM format. I see that there are different
 packages for loading individual studies in DICOM format, however I have had
 limited luck so far researching how the different studies can be tranformed
 into MNI or Talairach space. Is there an R-implementation of this?

 Best,

 M



Re: [R-SIG-Finance] Kalman Filter Implementation in R

2014-06-16 Thread Suzen, Mehmet
I suggest you read the paper by Fernando Tusell from the University of
the Basque Country:
"Kalman Filtering in R", Journal of Statistical Software, Vol. 39, Issue 2, March 2011
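
As a starting point before reaching for a package, here is a minimal base-R sketch of a Kalman filter for the simplest state-space model, the local level model (a random-walk state observed with noise). The function name `kalman_local_level`, its arguments, and the simulated data are all made up for illustration; this is not code from the paper above. Base R also ships stats::StructTS() for fitting this kind of model.

```r
# Local level model (an assumed toy model):
#   y[t]     = state[t] + eps[t],  eps ~ N(0, sigma2_eps)
#   state[t] = state[t-1] + eta[t], eta ~ N(0, sigma2_eta)
kalman_local_level <- function(y, sigma2_eps, sigma2_eta, a1 = 0, p1 = 1e7) {
  n <- length(y)
  state <- numeric(n)     # filtered state estimates
  variance <- numeric(n)  # filtered state variances
  a_pred <- a1            # predicted state (diffuse-ish prior mean)
  p_pred <- p1            # predicted variance (large = vague prior)
  for (t in seq_len(n)) {
    gain <- p_pred / (p_pred + sigma2_eps)           # Kalman gain
    state[t] <- a_pred + gain * (y[t] - a_pred)      # update with observation
    variance[t] <- (1 - gain) * p_pred
    a_pred <- state[t]                               # predict next state
    p_pred <- variance[t] + sigma2_eta
  }
  list(state = state, variance = variance)
}

# Simulated example: a slowly drifting level observed with noise.
set.seed(1)
truth <- cumsum(rnorm(100, sd = sqrt(0.1)))
y <- truth + rnorm(100, sd = 1)
fit <- kalman_local_level(y, sigma2_eps = 1, sigma2_eta = 0.1)
print(head(cbind(observed = y, filtered = fit$state)))
```

In practice the two variances are unknown and have to be estimated, for example by maximum likelihood, which is what the packages reviewed in the paper take care of.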


On 16 June 2014 11:21, Manuj Goel mg...@st-andrews.ac.uk wrote:
 Hello everyone,

 I am an applied statistics post-graduate student and am doing my
 dissertation on kalman filters and its application on financial models. I
 have read quite a lot papers on kalman filters and I am able to understand
 their methodology. But I am unable to work my way through to build a basic
 Kalman model in R. Can someone help me with this please. Any and all help
 really appreciated. Thanks.

 Kind Regards,
 M



Re: [R] Logistic Regression

2014-06-14 Thread Suzen, Mehmet
You might want to read this vignette:
http://cran.r-project.org/web/packages/HSAUR/vignettes/Ch_logistic_regression_glm.pdf

On 14 June 2014 19:53, javad bayat j.bayat...@gmail.com wrote:
 Dear all, I have to use Zelig package for doing logistic regression.
 How can I use Zelig package for logistic regression?

 I did this code by glm function:

 glm1 = glm(kod~Curv+Elev+Out.c+Slope+Aspect,data=data,
family=binomial)
 summary(glm1)

 But the results were not appropriate for my data.

 Many thanks for your helps.







 --
 Best Regards
 Javad Bayat
 M.Sc. Environment Engineering
 Shahid Beheshti (National) University (SBU)
 Alternative Mail: bayat...@yahoo.com



Re: [R] Defining default method for S3, S4 and R5 classes

2014-06-14 Thread Suzen, Mehmet
There is a nice tutorial on this:
http://adv-r.had.co.nz/OO-essentials.html

For an in-depth guide, have a look at the book by John Chambers,
"Software for Data Analysis: Programming with R".
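
One possible way to get a single entry point across S3, S4 and reference (R5) classes is an S4 generic whose "ANY" method acts as the default. The sketch below is only an illustration: `get_cluster_list` is the name used in the question, while `ToyS4Clustering`, `ToyRCClustering`, `labels_to_list` and the `labels` field are hypothetical stand-ins for whatever the real clustering functions return.

```r
library(methods)

# One S4 generic; the "ANY" method plays the role of a .default method.
setGeneric("get_cluster_list",
           function(x, ...) standardGeneric("get_cluster_list"))

setMethod("get_cluster_list", "ANY", function(x, ...) {
  stop("no get_cluster_list() method for class ", class(x)[1])
})

# Helper (hypothetical): cluster labels -> list of member ids per cluster.
labels_to_list <- function(labels) split(seq_along(labels), labels)

# S3 class: register it with setOldClass() so S4 dispatch can see it.
setOldClass("kmeans")
setMethod("get_cluster_list", "kmeans",
          function(x, ...) labels_to_list(x$cluster))

# S4 class (hypothetical).
setClass("ToyS4Clustering", representation(labels = "integer"))
setMethod("get_cluster_list", "ToyS4Clustering",
          function(x, ...) labels_to_list(x@labels))

# Reference (R5) class (hypothetical); these are S4 classes underneath.
ToyRCClustering <- setRefClass("ToyRCClustering",
                               fields = list(labels = "integer"))
setMethod("get_cluster_list", "ToyRCClustering",
          function(x, ...) labels_to_list(x$labels))

# Plain named list with a 'labels' element (an assumed layout).
setMethod("get_cluster_list", "list",
          function(x, ...) labels_to_list(x$labels))

labs <- c(1L, 2L, 1L)
results <- list(
  s4   = get_cluster_list(new("ToyS4Clustering", labels = labs)),
  rc   = get_cluster_list(ToyRCClustering$new(labels = labs)),
  list = get_cluster_list(list(labels = labs))
)
print(results$s4)
```

The pipeline body then only ever calls get_cluster_list(), and supporting a new algorithm means adding one setMethod() call.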

On 13 June 2014 12:20, Luca Cerone luca.cer...@gmail.com wrote:
 Dear all,

 I am writing a script implementing a pipeline to analyze some of the
 data we receive.

 One of the steps in this pipeline involves clustering the data, and I
 am interested
 in studying the effects of different clustering algorithms on the final 
 results.

 I am having issues making my code general enough because the
 clustering algorithms we are interested all return different types of
 objects (S3, S4 and R5 classes, as well as simple named lists).

 From the output of these algorithms I need to extract a list with as many
 elements as the number of clusters and such that each element contains the ids
 of the elements in each cluster.

 I have easily done this for each of the cluster algorithms,
 the problem is: how can I make so that rather than having to check for
 classes and
 types this is done automatically?

 For example, for the algorithms that return S3 classes I have defined
 a method get_cluster_list.default and then created the methods for
 the individual classes, which is used in the main body of the
 pipeline.

 I have no idea how I can do this for S4 and R5 classes and,  more
 importantly, I would
 like an approach that works when using all S3, S4 and R5 classes.

 Do you know how I could do this?

 Thanks for the help,
 Luca



Re: [R] copula fitting

2014-06-10 Thread Suzen, Mehmet
Yes, you can.

On 7 June 2014 16:04, mudit gupta muditf...@gmail.com wrote:
 Hi guys,

 can i fit  a copula to two marginal  distributions with different sample
 size?
 like one has 2340 observations and other has 1912.

 thanks
 Mudit



Re: [R] copula fitting

2014-06-10 Thread Suzen, Mehmet
Have you checked out the 'copula' package?

On 11 June 2014 00:36, Suzen, Mehmet msu...@gmail.com wrote:
 yes you can.

 On 7 June 2014 16:04, mudit gupta muditf...@gmail.com wrote:
 Hi guys,

 can i fit  a copula to two marginal  distributions with different sample
 size?
 like one has 2340 observations and other has 1912.

 thanks
 Mudit



Re: [R] How to Optimize two functions together in R

2014-05-19 Thread Suzen, Mehmet
Use the default values initially, to see if you get reasonable results.
See here for the details of NSGA-II:
http://dx.doi.org/10.1109/4235.996017

On 19 May 2014 16:42, Mingxuan Han han...@purdue.edu wrote:
 I try to use NSGAII function in the mco but I am kind of confusing about the
 numbers of input and output dimension in the argument. Could you explain a
 little about this? Thank you.





Re: [R] How to Optimize two functions together in R

2014-05-18 Thread Suzen, Mehmet
This is a multi-objective optimisation problem.
Try the mco and emoa packages.

http://stats.stackexchange.com/questions/77580/optimization-of-multiple-objective-functions-with-constraints
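
If installing a dedicated package is not an option, a much simpler technique than the evolutionary algorithms in those packages is weighted-sum scalarisation: combine the two objectives into one with a weight w and minimise that with optim(), repeating for several weights. The sketch below uses two made-up quadratic objectives `f1` and `f2`; it is not the poster's problem, and a weighted sum can only recover the convex part of a Pareto front.

```r
# Two toy objectives over the same parameters p = c(x, y) (assumed).
f1 <- function(p) (p[1] - 1)^2 + (p[2] - 2)^2   # minimum at (1, 2)
f2 <- function(p) (p[1] + 1)^2 + p[2]^2         # minimum at (-1, 0)

# Single scalar objective: w = 1 means only f1 matters, w = 0 only f2.
weighted_sum <- function(p, w) w * f1(p) + (1 - w) * f2(p)

# Sweep the weight to trace out compromise solutions.
weights <- seq(0, 1, by = 0.25)
front <- t(sapply(weights, function(w) {
  fit <- optim(c(0, 0), weighted_sum, w = w, method = "BFGS")
  c(w = w, x = fit$par[1], y = fit$par[2],
    f1 = f1(fit$par), f2 = f2(fit$par))
}))
print(round(front, 3))
```

Each row is one compromise between the two objectives; which one to pick depends on how the objectives should be traded off.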


On 15 May 2014 17:47, Mingxuan Han han...@purdue.edu wrote:
 I am trying to minimize two functions with same set of parameter(x,y).
 Currently I can only use optim() to minimize the each function one by one.
 Is there any solution I can use to optimize two functions with same set of
 (X,Y) together.





Re: [R] Calculating transition probabilties

2014-05-13 Thread Suzen, Mehmet
This looks like homework about Markov chains rather than an R
question.
But have a look at the markovchain package from CRAN:
http://cran.r-project.org/web/packages/markovchain/vignettes/an_introduction_to_markovchain_package.pdf
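
For a two-state chain the transition probabilities can also be estimated in base R by counting consecutive pairs. This is a minimal sketch using the data from the question below; the threshold rule (strictly greater than 3 is "useful" = 1, everything else 0) is an assumption, since the question does not say how a value of exactly 3 should be treated.

```r
data_set <- c(2,0,45,6,78,3,0,2,6,0,5,8,0,2,8,9,12,212,22,4,1,0,3,5,
              88,5,69,12,4,0,0,0,0,4,87,6,99,104,22,7)

# Recode to states: 1 = useful (> 3), 0 = not useful (assumed rule).
state <- factor(as.integer(data_set > 3), levels = 0:1)

# Pair each observation with its successor and count the transitions.
from <- state[-length(state)]
to <- state[-1]
counts <- table(from = from, to = to)

# Row-normalise: entry [i, j] estimates P(next = j | current = i).
P <- prop.table(counts, margin = 1)
print(round(P, 3))
```

Each row of P sums to one; for example, P["1", "0"] is the estimated probability of moving from useful to not useful.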

On 13 May 2014 16:49, Baba Bukar bbu...@nda.edu.ng wrote:
 Dear all,

 I am new to R and have some problem computing transition probabilities. My
 problem goes like this;

 data_set <-
 c(2,0,45,6,78,3,0,2,6,0,5,8,0,2,8,9,12,212,22,4,1,0,3,5,88,5,69,12,4,0,0,0,0,4,87,6,99,104,22,7)

 observations greater than, say 3, is considered as useful (denoted as 1)
 while less than 3 are not useful (denoted as 0). Am trying to calculate the
 transition in these count data such as P_1,1=prob from useful to useful,
 P_1,0=prob from useful to not useful, P_0,1=prob from not useful to useful
 and P_0,0=prob from not useful to not useful.

 Thank you much as you respond soonest

 Kind regards
 Zakir



Re: [R] problem in r-code

2014-05-09 Thread Suzen, Mehmet
Wrong list. This is an R list, not a BUGS list.
You may want to consult the BUGS materials:
http://www2.mrc-bsu.cam.ac.uk/bugs/weblinks/webresource.shtml

On 8 May 2014 11:36, thanoon younis thanoon.youni...@gmail.com wrote:
 dear all members

 is there anyone explain to me the code below and how can i transfer this
 code to winbugs program.

 q[i,1]=qnorm(runif(1,min=.5,max=1),0,1)

 thanks in advance

 thanoon



Re: [R] How can I make this nested loop faster?

2014-05-09 Thread Suzen, Mehmet
Your code is not reproducible. Can you provide a working example
using a standard dataset from R?
But you could first try the compiler package; see ?enableJIT.
Another option would be to use the doMC/foreach packages, if the
assignment in the nested loop can be run in parallel; see %dopar%.
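
Independently of compilation or parallelism, the inner loop itself can usually be removed by looking up each row's rate with match(). The sketch below is a guess at the intent of the quoted code, using a tiny made-up `rftab`; the function names `budget_loop` and `budget_match` are hypothetical. It spells out rlnorm()'s arguments as meanlog/sdlog, which the quoted `mean`/`sd` partially match.

```r
# Toy stand-in for the poster's table (assumed structure).
rftab <- data.frame(startCell = c(10, 20, 10, 30, 20), masskg = NA_real_)

# Original approach: one subset assignment per unique cell.
budget_loop <- function(rftab, n_sim) {
  uc <- unique(rftab$startCell)
  budget <- numeric(n_sim)
  for (i in seq_len(n_sim)) {
    uniquerates <- rlnorm(n = length(uc), meanlog = -1.6, sdlog = 1.7)
    for (j in seq_along(uc)) {
      rftab$masskg[rftab$startCell == uc[j]] <- uniquerates[j]
    }
    budget[i] <- sum(rftab$masskg, na.rm = TRUE) / 1000
  }
  budget
}

# Vectorised: compute each row's position in uc once, then index.
budget_match <- function(rftab, n_sim) {
  uc <- unique(rftab$startCell)
  idx <- match(rftab$startCell, uc)
  budget <- numeric(n_sim)
  for (i in seq_len(n_sim)) {
    uniquerates <- rlnorm(n = length(uc), meanlog = -1.6, sdlog = 1.7)
    budget[i] <- sum(uniquerates[idx], na.rm = TRUE) / 1000
  }
  budget
}

set.seed(42); slow <- budget_loop(rftab, 5)
set.seed(42); fast <- budget_match(rftab, 5)
print(all.equal(slow, fast))
```

The match() call runs once instead of length(uc) comparisons per simulation, which is where most of the time goes when length(uc) is large.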

On 8 May 2014 21:59, Ludwig Hilger l.hil...@ku.de wrote:
 Hello everybody,
 I have written a nested for-loop, but as length(uc) > 170,000, this would
 take VERY long. I have tried to use sapply or something but I cannot get it
 to work, I would be happy if someone could point out to write this more
 efficiently. Thank you all,

 Ludwig

 ergsens <- data.frame(budget = numeric(500))
 uc <- unique(rftab$startCell)

 for(i in 1:500){
   uniquerates <- rlnorm(n = length(uc), mean = -1.6, sd = 1.7)
   for(j in 1:length(uc)){
     rftab$masskg[rftab$startCell == uc[j]] <- uniquerates[j]
   }
   ergsens$budget[i] <- sum(rftab$masskg, na.rm = TRUE)/1000
 }





 -
 Dipl. Geogr. Ludwig Hilger
 Wiss. MA
 Lehrstuhl für Physische Geographie
 Katholische Universität Eichstätt-Ingolstadt
 Ostenstraße 18
 85072 Eichstätt



Re: [R] uniform number

2014-05-05 Thread Suzen, Mehmet
WTF?

Is that a R package from you?



On 5 May 2014 09:27, Rolf Turner r.tur...@auckland.ac.nz wrote:
 On 05/05/14 17:05, Ragia Ibrahim wrote:


 Dear group,
 How to generate  uniform probability choosing p to be 2% and 5%, in
 separate trials for 100 times.


 No idea WTF you are talking about.  Can you formulate a question that is
 comprehensible to the human mind?

 cheers,

 Rolf Turner



Re: [R] uniform number

2014-05-05 Thread Suzen, Mehmet
The paper you cite is about social networks. You may want to use the
igraph or sna packages.
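One possible reading of the quoted passage, sketched in base R under that assumption: every edge gets the same probability p, and in each trial every edge is independently kept ("live") with that probability. The edge list and the helper name simulate_live_edges are made up for illustration; igraph or sna would supply a real graph.

```r
set.seed(123)
# Hypothetical edge list of a small directed graph.
edges <- data.frame(from = c(1, 1, 2, 3, 4, 4),
                    to   = c(2, 3, 3, 4, 1, 2))

simulate_live_edges <- function(edges, p, trials = 100) {
  # One row per trial, one column per edge; TRUE means the edge is live.
  matrix(runif(trials * nrow(edges)) < p, nrow = trials)
}

live_01 <- simulate_live_edges(edges, p = 0.01)
live_10 <- simulate_live_edges(edges, p = 0.10)
colMeans(live_10)   # empirical activation frequency per edge
```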

On 5 May 2014 10:54, Ragia Ibrahim ragi...@hotmail.com wrote:
 thanks for replying

 in the following paper
 http://www.cs.cornell.edu/home/kleinber/kdd03-inf.pdf
 page 6 third paragraph


 the author writes:
 assigned a uniform probability of p to each edge of the graph, choosing p
 to be 1% and 10%
 in separate trials.


 how to use R function to get such probability ?
 Regards

 Date: Mon, 5 May 2014 10:12:49 +0200
 Subject: Re: [R] uniform number
 From: msu...@gmail.com
 To: r.tur...@auckland.ac.nz
 CC: ragi...@hotmail.com; r-help@r-project.org

 WTF?

 Is that a R package from you?



 On 5 May 2014 09:27, Rolf Turner r.tur...@auckland.ac.nz wrote:
  On 05/05/14 17:05, Ragia Ibrahim wrote:
 
 
  Dear group,
  How to generate uniform probability choosing p to be 2% and 5%, in
  separate trials for 100 times.
 
 
  No idea WTF you are talking about. Can you formulate a question that is
  comprehensible to the human mind?
 
  cheers,
 
  Rolf Turner
 


Re: [gmx-users] Silica Nanoparticles

2014-04-22 Thread Suzen, Mehmet
You may try a special potential designed for amorphous silica:

A numerical investigation of the liquid–vapor coexistence curve of silica
Yves Guissani and Bertrand Guillot
J. Chem. Phys. 104, 7633 (1996)



On 22 April 2014 17:12, Kazem Sepehrinia ksepehri...@gmail.com wrote:
 Hi Dear All,

 Have any of you guys prepared silica nanoparticles in your Molecular
 Simulation studies? I used some open databases for obtaining of bulk
 amorphous silica file. But i'm not able to prepare silica nanoparticles.
 Once again i used materials studio glasses and made a nanoparticle but that
 one is not working also. Because i tried to minimize it and job failed. Any
 help would be greatly appreciated.

 Thanks,
 Kazem Sepehrinia.
 --
 Gromacs Users mailing list

 * Please search the archive at 
 http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!

 * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists

 * For (un)subscribe requests visit
 https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a 
 mail to gmx-users-requ...@gromacs.org.


Re: [R] inverse normal distribution function

2014-04-19 Thread Suzen, Mehmet
You may want to read about generalized linear modelling and link
functions to set up an appropriate link function for your categorical
variable. See the R documentation: ?glm, ?family and ?inverse.gaussian.
Also look at the original paper by John Nelder and Robert Wedderburn; it
is freely available courtesy of JSTOR:
http://www.jstor.org/discover/10.2307/2344614
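A small sketch, assuming that "inverse normal" here means the inverse of the standard normal CDF (qnorm in R, which is also the probit link): thresholds for an ordered categorical variable can be read off the cumulative category proportions. The data vector z is invented for illustration.

```r
# Invented ordered categorical data with four levels.
z <- factor(c(1, 1, 2, 2, 2, 3, 3, 3, 3, 4), ordered = TRUE)

cum_prop <- cumsum(table(z)) / length(z)
# Drop the last cumulative proportion: it equals 1 and would map to +Inf.
thresholds <- qnorm(cum_prop[-length(cum_prop)])
thresholds
```

For a regression setting, the same link is available as binomial(link = "probit") in glm.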

On 18 April 2014 09:13, thanoon younis thanoon.youni...@gmail.com wrote:
 dear all members

 i want to use inverse normal distribution in R to show the value of
 variable Z when Z represent the ordered categorical variables. i hope
 anyone gives me an example on this distribution
 .

 thanks to all

 [[alternative HTML version deleted]]



Re: [R] inverse normal distribution function

2014-04-19 Thread Suzen, Mehmet
Not sure how you would do that, but there is a package 'sem' on CRAN for
structural equation models.

On 20 April 2014 01:10, thanoon younis thanoon.youni...@gmail.com wrote:
 thank you so much Suzen
 i want to use bayesian analysis in structural equation models with ordered
 categorical data and i want to use inverse normal as a distribution of
 thresholds and i  dont find any paper or documents in R or another program
 about inverse normal.

 best regards




Re: [R] operating website through R

2014-04-12 Thread Suzen, Mehmet
This is not very elegant, since a data provider would normally offer a
proper access API, but you can, for example, do this:

> myAdd <-
 'http://disc2.nascom.nasa.gov/daac-bin/Giovanni/tovas/Giovanni_cgi.pl?west=60&north=50&east=70&south=-50&params=0|3B42_V7&plot_type=Area+Plot&byr=2014&bmo=01&bdy=31&eyr=2014&emo=01&edy=31&begin_date=1998%2F01%2F01&end_date=2014%2F01%2F31&cbar=cdyn&cmin=&cmax=&yaxis=ydyn&ymin=&ymax=&yint=&ascres=0.25x0.25&global_cfg=tovas.global.cfg.pl&instance_id=TRMM_V7&prod_id=3B42_daily&action=ASCII+Output'

> myData <- read.table(myAdd, skip=5, header=TRUE)

Will give you this:

> str(myData)
'data.frame':	16441 obs. of  3 variables:
 $ Latitude : num  -50 -50 -50 -50 -50 -50 -50 -50 -50 -50 ...
 $ Longitude: num  60 60.2 60.5 60.8 61 ...
 $ AccRain  : num  0 0.42 0.39 0.42 0.66 1.23 2.31 2.37 3.72 3.63 ...

To choose different parameters, for example the coordinates, you just
need to change the values of the parameters that follow
Giovanni_cgi.pl? in 'myAdd': west, north, east, south. But you must be
sure that data are available for those parameters; there is no error
handling here.
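As a sketch of how the coordinates could be varied row by row: the helper build_query_url and the coords values below are hypothetical, and only the bounding box and the action parameter are filled in, so a real request would still need the remaining parameters from 'myAdd'. Nothing is downloaded here.

```r
# Hypothetical helper: build a query URL from a base address and a named list.
build_query_url <- function(base, params) {
  pairs <- paste0(names(params), "=", unlist(params))
  paste0(base, "?", paste(pairs, collapse = "&"))
}

base <- "http://disc2.nascom.nasa.gov/daac-bin/Giovanni/tovas/Giovanni_cgi.pl"

# Invented bounding boxes, one per row.
coords <- data.frame(west  = c(60, 65),  north = c(50, 45),
                     east  = c(70, 75),  south = c(-50, -40))

urls <- vapply(seq_len(nrow(coords)), function(i) {
  params <- c(as.list(coords[i, ]), list(action = "ASCII+Output"))
  build_query_url(base, params)
}, character(1))
urls
```

Each element of urls could then be passed to read.table as above, ideally wrapped in try() because the server may have no data for a given box.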

On 11 April 2014 17:45, eliza botto eliza_bo...@hotmail.com wrote:
 Dear Suzen,

 I couldn't understand. Could you please elaborate it with a small example?

 :(

 Thanks in advance.

 Eliza

 Date: Fri, 11 Apr 2014 17:31:18 +0200
 Subject: Re: [R] operating website through R
 From: msu...@gmail.com
 To: eliza_bo...@hotmail.com
 CC: r-help@r-project.org


 You just need to pass the parameters on Giovanni_cgi.pl with
 action=ASCII+Output

 On 11 April 2014 17:19, eliza botto eliza_bo...@hotmail.com wrote:
  Dear Users of R,
  I wanted to operate certain slots of this website
  (http://disc2.nascom.nasa.gov/Giovanni/tovas/TRMM_V7.3B42_daily.2.shtml)
  through R. I wanted to operate Latitude, longitude section, plot type, 
  begin
  and end year and ASCII Output Resolution. The filling of these slot will
  produce and output file with I want to D/L at a certain location in my PC.
  I have a matrix of 2 columns and 3000 rows which contain Latitude and
  Longitude information which i want to upload automatically in the slots of
  website. I tried to use certain web scarping techniques in R but to no use.
  Is there a way of doing it in R.
  thank you very much in advance,
  Eliza
 
 
  [[alternative HTML version deleted]]
 


Re: [R] operating website through R

2014-04-11 Thread Suzen, Mehmet
You just need to pass the parameters on Giovanni_cgi.pl with action=ASCII+Output

On 11 April 2014 17:19, eliza botto eliza_bo...@hotmail.com wrote:
 Dear Users of R,
 I wanted to operate certain slots of this website 
 (http://disc2.nascom.nasa.gov/Giovanni/tovas/TRMM_V7.3B42_daily.2.shtml) 
 through R. I wanted to operate Latitude, longitude section, plot type, begin 
 and end year and ASCII Output Resolution. The filling of these slot will 
 produce and output file with I want to D/L at a certain location in my PC.
 I have a matrix of 2 columns and 3000 rows which contain Latitude and 
 Longitude information which i want to upload automatically in the slots of 
 website. I tried to use certain web scarping techniques in R but to no use.
 Is there a way of doing it in R.
 thank you very much in advance,
 Eliza


 [[alternative HTML version deleted]]



Re: [R] Sampling according to type

2014-03-05 Thread Suzen, Mehmet
If I understood correctly, you need weighted sampling. Try the 'prob'
argument of 'sample'. For your example:

n <- 10
ntype <- rbinom(n, 1, 0.5)
myProbs <- rep(1/10, 10) # equally likely
myProbs[ which(ntype == 0)] <- 0.75/7 # Divide so the sum will be 1.0
myProbs[ which(ntype == 1)] <- 0.25/3
sample(ntype,3, prob=myProbs)




On 5 March 2014 15:20, Thomas thomas.ches...@nottingham.ac.uk wrote:
 I have a matrix where each entry represents a data subject's type, 1 or 0:

 n <- 10
 ntype <- rbinom(n, 1, 0.5)

 and I'd like to sample say 3 subjects from ntype where those subjects who
 are Type 1 are selected with probability say 0.75, and Type 0 with (1-0.75).
 (So the sample would produce a list with three indices each referring to a
 position within ntype.)

 Can anyone suggest a way to do this please?

 Thank you,

 Thomas Chesney



Re: [R] Sampling according to type

2014-03-05 Thread Suzen, Mehmet
 myProbs[ which(ntype == 0)] <- 0.75/7 # Divide so the sum will be 1.0
 myProbs[ which(ntype == 1)] <- 0.25/3

Here of course you need to divide by the number of 0s and 1s; 7 and 3
were just an example.
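A sketch of that correction, under two assumptions: the ntype vector is fixed by hand here so that both types are present, and Type 1 is given the total mass 0.75 as in the original question. The counts are computed rather than hard-coded, and indices are sampled instead of values.

```r
set.seed(42)
# Hand-made type vector (an assumption; the post used rbinom(n, 1, 0.5)).
ntype <- c(1, 0, 0, 1, 0, 1, 0, 0, 0, 1)

n1 <- sum(ntype == 1)
n0 <- sum(ntype == 0)

# Type 1 subjects share a total mass of 0.75, Type 0 the remaining 0.25.
myProbs <- ifelse(ntype == 1, 0.75 / n1, 0.25 / n0)

# Sample positions within ntype, not the 0/1 values themselves.
picked <- sample(seq_along(ntype), 3, prob = myProbs)
picked
ntype[picked]
```

Because sample() draws without replacement by default, the 0.75/0.25 split holds exactly only for the first draw; the later draws are renormalised over the remaining subjects.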



Re: [R] Shortest connected path in a matrix

2014-03-05 Thread Suzen, Mehmet
You may want to check the Bioconductor packages that implement graph
algorithms. Maybe this one:
http://www.bioconductor.org/packages/release/bioc/manuals/RBGL/man/RBGL.pdf
See, for example, ?dijkstra.sp
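For small matrices, a plain base-R Dijkstra search may already be enough. The sketch below is not the RBGL routine; shortest_to_row is a hypothetical helper that treats TRUE cells as nodes, allows king moves (cost 1 for straight steps, sqrt(2) for diagonal ones) and assumes the start cell is TRUE. The matrix is the example from the quoted post.

```r
shortest_to_row <- function(m, start, target_row) {
  nr <- nrow(m); nc <- ncol(m)
  dist <- matrix(Inf, nr, nc)      # best known distance to each cell
  done <- matrix(FALSE, nr, nc)    # cells whose distance is final
  dist[start[1], start[2]] <- 0
  repeat {
    cand <- dist
    cand[done] <- Inf
    if (all(is.infinite(cand))) break   # nothing reachable is left
    k  <- which.min(cand)               # column-major linear index
    r  <- (k - 1) %% nr + 1
    cl <- (k - 1) %/% nr + 1
    done[r, cl] <- TRUE
    for (dr in -1:1) for (dc in -1:1) {
      rr <- r + dr; cc <- cl + dc
      if ((dr == 0 && dc == 0) || rr < 1 || rr > nr || cc < 1 || cc > nc) next
      if (!m[rr, cc]) next              # only TRUE cells can be entered
      d <- dist[r, cl] + sqrt(dr^2 + dc^2)
      if (d < dist[rr, cc]) dist[rr, cc] <- d
    }
  }
  targets <- dist[target_row, m[target_row, ]]
  if (length(targets) == 0) Inf else min(targets)
}

m <- matrix(c(FALSE, TRUE,  FALSE, TRUE,  FALSE,
              TRUE,  FALSE, TRUE,  FALSE, TRUE,
              TRUE,  TRUE,  FALSE, FALSE, FALSE,
              FALSE, TRUE,  FALSE, TRUE,  FALSE,
              FALSE, TRUE,  FALSE, FALSE, TRUE),
            nrow = 5, byrow = TRUE,
            dimnames = list(as.character(1:5), letters[1:5]))

# Cell b1 to any TRUE cell in row 5; the post gives sqrt(2) + 1 + sqrt(2) + 1.
shortest_to_row(m, start = c(1, 2), target_row = 5)
```

This version rescans the whole matrix at every step, which is fine for a chessboard but slow for large grids; a graph package with a priority queue would scale better.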

On 5 March 2014 18:44, McCloskey, Bryan bmcclos...@usgs.gov wrote:
 Here is some example data (hopefully the monospace formatting is preserved):

      a   b   c   d   e
      -   -   -   -   -
  1 | F | T | F | T | F |
      -   -   -   -   -
  2 | T | F | T | F | T |
      -   -   -   -   -
  3 | T | T | F | F | F |
      -   -   -   -   -
  4 | F | T | F | T | F |
      -   -   -   -   -
  5 | F | T | F | F | T |
      -   -   -   -   -

 So, for cell b1, the shortest possible path to a true value in row 5 is
 b1-a2-a3-b4-b5 (distance: sqrt(2) + 1 + sqrt(2) + 1).

 * Shortest paths are not necessarily unique, but I just need to find the
 length.

 * If it's computationally hard to guarantee the absolute shortest path, I
 can probably live with nearly shortest paths.

 * Paths can backtrack, so the shortest path from cell e2 to row 4 is
 e2-d1-c2-b3-b4-b5.

 I need to calculate the shortest path for all true cells to all rows
 further down the matrix. I'm afraid I'm going to have to write some sort of
 recursive path-tracing algorithm, but I'm hoping there's a package already
 in existence that accomplishes this already...

 -bryan

 On Tue, Mar 4, 2014 at 1:13 PM, McCloskey, Bryan bmcclos...@usgs.gov wrote:

 I have a binary rectangular T/F matrix; I need to be able to calculate the
 shortest path (i.e., Pythagorean distance) between a populated cell in row
 j and any populated cell in some row j+n.

 For instance, if I have a chessboard with random black/white square
 colors, I need the shortest distance (linear distance, not number of steps)
 for a king to get from a specified black space on the first row, to _any_
 black space in a specified further row, traveling only on black spaces.

 Any idea? Thanks,

 -bryan


 [[alternative HTML version deleted]]



Re: [gmx-users] Computing melting point

2014-01-14 Thread Suzen, Mehmet
The Lindemann criterion might be easier. See, for example,

* Materials science: Melting from within
Nature 413, 582-583 (11 October 2001) | doi:10.1038/35098169
Robert W. Cahn


On 14 January 2014 15:35, Golshan Hejazi golshan.hej...@yahoo.com wrote:
 Hello everyone!

 I would like to compute the melting point of a drug crystalline system. In 
 the literature, there exist a good number of methods to do so!
 Among them, I read Gibbs-Duhem integration technique in which one needs to 
 provide a reference coexistence of solid/liquid. I read some articles in 
 which they studied the melting point of water and also some ionic crystals. 
 But I would like to know whether you can suggest me some more materials to 
 read to find some ideas how to choose the reference structure for drug 
 crystals?

 At the end, I would like to perform this simulation using gromacs. So if you 
 think there are other methods which are more straight forward, would be great 
 to let me know.

 Best
 G.


Re: [R] Season's Greetings (and great news ... )!

2013-12-22 Thread Suzen, Mehmet
I wouldn't blame R for floating-point arithmetic and our personal
feeling of what 'zero' should be.

> options(digits=20)
> pi
[1] 3.141592653589793116
> sqrt(pi)^2
[1] 3.1415926535897926719
> (pi - sqrt(pi)^2) < 1e-15
[1] TRUE

There was a similar post before, for example see:
http://r.789695.n4.nabble.com/Why-does-sin-pi-not-return-0-td4676963.html

There is an example by Martin Maechler (author of Rmpfr) on how to use
arbitrary precision
with your arithmetic.
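In everyday code the usual remedy is to compare with a tolerance instead of testing exact equality. A minimal base-R sketch; the factor 8 in the last line is an arbitrary choice made for this illustration:

```r
x <- sqrt(pi)^2

x == pi                                 # exact comparison, sensitive to rounding
isTRUE(all.equal(x, pi))                # comparison up to a small tolerance
abs(x - pi) < 8 * .Machine$double.eps   # explicit tolerance of a few ulps
```

all.equal() returns a descriptive string rather than FALSE when the values differ, which is why it is wrapped in isTRUE().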

On 22 December 2013 10:59, Ted Harding ted.hard...@wlandres.net wrote:
 Greetings All!
 With the Festive Season fast approaching, I bring you joy
 with the news (which you will surely wish to celebrate)
 that R cannot do arithmetic!

 Usually, this is manifest in a trivial way when users report
 puzzlement that, for instance,

   sqrt(pi)^2 == pi
   # [1] FALSE

 which is the result of a (generally trivial) rounding or
 truncation error:

   sqrt(pi)^2 - pi
   [1] -4.440892e-16

 But for some very simple calculations R goes off its head.

 I had originally posted this example some years ago, but I
 have since generalised it, and the generalisation is even
 more entertaining than the original.

 The Original:
 Consider a sequence generated by the recurrence relation

   x[n+1] = 2*x[n] if 0 <= x[n] <= 1/2
   x[n+1] = 2*(1 - x[n]) if 1/2 < x[n] <= 1

 (for 0 <= x[n] <= 1).

 This has equilibrium points (x[n+1] = x[n]) at x[n] = 0
 and at x[n] = 2/3:

   2/3 -> 2*(1 - 2/3) = 2/3

 It also has periodic points, e.g.

   2/5 -> 4/5 -> 2/5 (period 2)
   2/9 -> 4/9 -> 8/9 -> 2/9 (period 3)

 The recurrence relation can be implemented as the R function

   nextx <- function(x){
     if( (0<=x)&(x<=1/2) ) {x <- 2*x} else {x <- 2*(1 - x)}
   }

 Now have a look at what happens when we start at the equilibrium
 point x = 2/3:

   N <- 1 ; x <- 2/3
   while(x > 0){
     cat(sprintf("%i: %.9f\n",N,x))
     x <- nextx(x) ; N <- N+1
   }
   cat(sprintf("%i: %.9f\n",N,x))

 Run that, and you will see that successive values of x collapse
 towards zero. Things look fine to start with:

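   1: 0.666666667
   2: 0.666666667
   3: 0.666666667
   4: 0.666666667
   5: 0.666666667
   ...

 but, later on,

   24: 0.666666667
   25: 0.666666666
   26: 0.666666668
   27: 0.666666664
   28: 0.666666672
   ...

   46: 0.667968750
   47: 0.664062500
   48: 0.671875000
   49: 0.656250000
   50: 0.687500000
   51: 0.625000000
   52: 0.750000000
   53: 0.500000000
   54: 1.000000000
   55: 0.000000000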

 What is happening is that, each time R multiplies by 2, the binary
 representation is shifted up by one and a zero bit is introduced
 at the bottom end. To illustrate this, do the calculation in
 7-bit arithmetic where 2/3 = 0.1010101, so:

 0.1010101 = x[1], > 1/2 so subtract from 1 = 1.0000000 -> 0.0101011,
 and then multiply by 2 to get x[2] = 0.1010110. Hence

 0.1010101 = x[1] -> 2*(1 - 0.1010101) = 2*0.0101011 ->
 0.1010110 = x[2] -> 2*(1 - 0.1010110) = 2*0.0101010 ->
 0.1010100 = x[3] -> 2*(1 - 0.1010100) = 2*0.0101100 ->
 0.1011000 = x[4] -> 2*(1 - 0.1011000) = 2*0.0101000 ->
 0.1010000 = x[5] -> 2*(1 - 0.1010000) = 2*0.0110000 ->
 0.1100000 = x[6] -> 2*(1 - 0.1100000) = 2*0.0100000 ->
 0.1000000 = x[7] -> 2*0.1000000 = 1.0000000 ->
 1.0000000 = x[8] -> 2*(1 - 1.0000000) = 2*0 ->
 0.0000000 = x[9] and the end of the line.

 The final index of x[i] is i=9, 2 more than the number of binary
 places (7) in this arithmetic, since 8 successive zeros have to
 be introduced. It is the same with the real R calculation since
 this is working to .Machine$double.digits = 53 binary places;
 it just takes longer (we reach 0 at x[55])! The above collapse
 to 0 occurs for any starting value in this simple example (except
 for multiples of 1/(2^k), when it works properly).

 Generalisation:
 This is basically the same, except that everything is multiplied
 by a scale factor S, so instead of being on the interval [0,1].
 it is on [0,S], and

   x[n+1] = 2*x[n] if 0 <= x[n] <= S/2
   x[n+1] = 2*(S - x[n]) if S/2 < x[n] <= S
 (for 0 <= x[n] <= S).

 Again, x[n] = 2*S/3 is an equilibrium point. 2*S/3 > S/2, so

   x[n] -> 2*(S - 2*S/3) = 2*(S/3) = 2*S/3

 Functions to implement this:

   nxtS <- function(x,S){
     if((x >= 0)&(x <= S/2)){ x <- 2*x } else {x <- 2*(S-x)}
   }

   S <- 6 ##  Or some other value of S
   Nits <- 100
   x <- 2*S/3
   N <- 1 ; print(c(N,x))
   while(x>0){
     if(N > Nits) break   ### to stop infinite looping
     N <- (N+1) ; x <- nxtS(x,S)
     print(c(N,x))
   }

 The behaviour of the sequence now depends on the value of S.

 If S is a multiple of 3, then with x[1] = 2*S/3 the equilibrium
 is immediately attained and x[n] = 2*S/3 forever after, since
 R is now calculating with integers. E.g. try the above with S <- 6.
 That is what arithmetic ought to be like! But for S not a multiple
 of 3 one can get the impression that R is on some sort of drug!

 For other values of S (but not all) we observe the same collapse
 to x=0 as before, and again it takes 54 steps (ending with x[55]).
 Try e.g. S <- 16

 For some values of S, however, the iteration ends up in a 

Re: [R] Functional Programming patterns

2013-11-20 Thread Suzen, Mehmet
Have you checked the lambda.r package of Brian Lee Yung Rowe?

http://cran.r-project.org/web/packages/lambda.r/index.html
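Independently of lambda.r, one base-R way to avoid a chain of if/else tests is a list of (pattern, handler) pairs that is walked until the first match. This is only a sketch: the rule set, the handler bodies and the name first_match are invented for illustration, and regmatches()/regexpr() stand in for str_extract.

```r
# Each rule pairs a regular expression with a function applied to the match.
rules <- list(
  list(pattern = "T[^.]*", handler = function(m) paste("T-token:", m)),
  list(pattern = "[0-9]+", handler = function(m) paste("number:", m))
)

first_match <- function(input, rules) {
  for (rule in rules) {
    hit <- regmatches(input, regexpr(rule$pattern, input))
    if (length(hit) == 1) return(rule$handler(hit))   # first rule that matches
  }
  NA_character_                                       # no rule matched
}

first_match("abc.T12.x", rules)
first_match("abc 42", rules)
first_match("xyz", rules)
```

Adding a case then means appending a rule to the list rather than nesting another if.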

On 20 November 2013 10:02,  mohan.radhakrish...@polarisft.com wrote:
 Hi,
 Not specific to 'R'. I search for patterns and found
 http://patternsinfp.wordpress.com/ which is too heavy for me. There is a
 'Pragmatic Programmer' book on such patterns for Scala and Clojure. Is
 there anything for R ?

 I wanted to code this. Is there a functional pattern in R for multiple
 'if' loops like this ?

 if( is.na(str_extract(input,"T[^.]*")[1]) ){

 }else{


 Thanks,
 Mohan



 [[alternative HTML version deleted]]



Re: [R] Sending a matrix in an email

2013-11-18 Thread Suzen, Mehmet
On 18 November 2013 05:37, Ira Fuchs irafu...@gmail.com wrote:
 I have a matrix which has colnames and I would like to send this matrix using 
 sendmailR. How can I convert this simple matrix

My one cent: for large objects or a full session, an RData file sent
as an attachment might be more convenient, i.e., ?save or ?save.image
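A minimal sketch of that round trip; the matrix and the file name are made up, and the mail step itself is left out:

```r
# A small matrix with column names, standing in for the poster's data.
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

path <- file.path(tempdir(), "my_matrix.RData")
save(m, file = path)            # this file would be the attachment

# On the receiving side: load into a separate environment to avoid clobbering.
restored <- new.env()
load(path, envir = restored)
identical(m, restored$m)
```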



Re: [R] Fitting arbitrary curve to 1D data with error bars

2013-11-14 Thread Suzen, Mehmet
Maybe you are after the 'weights' argument of 'lm' or 'glm'.

See: 
http://stackoverflow.com/questions/6375650/function-for-weighted-least-squares-estimates
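A sketch of what that could look like, assuming each point comes with a known standard error sigma and that inverse-variance weights are wanted; the data are simulated for illustration:

```r
set.seed(42)
x     <- 1:50
sigma <- runif(50, 0.1, 1)                 # invented per-point error bars
y     <- 2 + 3 * x + rnorm(50, sd = sigma)

# Points with small error bars get more weight in the fit.
fit <- lm(y ~ x, weights = 1 / sigma^2)
coef(fit)
```

For a curve that is not linear in its parameters, nls() takes a weights argument in the same way.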

On 14 November 2013 10:01, Erkcan Özcan erk...@hotmail.com wrote:
 Thanks, but if you have another closer look to my post, you will see that my 
 question has nothing to do with drawing error bars on a plot.

 What I want is to do a curve fit to a data with error bars.

 Best,
 e.

 On 14 Nov 2013, at 04:21, Suzen, Mehmet wrote:

 If you are after adding error bars in a scatter plot; one example is
 given below :

 #some example data
 set.seed(42)
 df <- data.frame(x = rep(1:10,each=5), y = rnorm(50))

 #calculate mean, min and max for each x-value
 library(plyr)
 df2 <- ddply(df,.(x),function(df)
 c(mean=mean(df$y),min=min(df$y),max=max(df$y)))

 #plot error bars
 library(Hmisc)
 with(df2,errbar(x,mean,max,min))
 grid(nx=NA,ny=NULL)

 (From: 
 http://stackoverflow.com/questions/13032777/scatter-plot-with-error-bars)





Re: [R] Fitting arbitrary curve to 1D data with error bars

2013-11-13 Thread Suzen, Mehmet
If you are after adding error bars in a scatter plot; one example is
given below :

#some example data
set.seed(42)
df <- data.frame(x = rep(1:10,each=5), y = rnorm(50))

#calculate mean, min and max for each x-value
library(plyr)
df2 <- ddply(df,.(x),function(df)
c(mean=mean(df$y),min=min(df$y),max=max(df$y)))

#plot error bars
library(Hmisc)
with(df2,errbar(x,mean,max,min))
grid(nx=NA,ny=NULL)

(From: http://stackoverflow.com/questions/13032777/scatter-plot-with-error-bars)



Re: [R] computation of hessian matrix

2013-11-01 Thread Suzen, Mehmet
On 1 November 2013 11:06, IZHAK shabsogh ishaqb...@yahoo.com wrote:
 below is a code to compute hessian matrix , which i need to generate 29 
 number of different matrices for example first

You may consider using the numDeriv (numerical derivatives) package for that instead, see:
http://cran.r-project.org/web/packages/numDeriv/vignettes/Guide.pdf
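A sketch with a made-up test function f: numDeriv::hessian() is used when the package is installed, otherwise the example falls back to stats::optimHess(), which is a different finite-difference routine and is swapped in here only so that the example runs everywhere.

```r
# Invented test function of two parameters.
f  <- function(p) p[1]^2 * p[2] + sin(p[2])
x0 <- c(1, 2)

H <- if (requireNamespace("numDeriv", quietly = TRUE)) {
  numDeriv::hessian(f, x0)
} else {
  stats::optimHess(x0, f)   # fallback, less accurate than numDeriv
}

# Analytic Hessian of f, for comparison.
H_exact <- matrix(c(2 * x0[2], 2 * x0[1],
                    2 * x0[1], -sin(x0[2])), nrow = 2)
max(abs(H - H_exact))
```

Looping such a call over 29 parameter vectors with lapply() would give the 29 matrices mentioned in the question.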



Re: [R] Revo R for Arima Implementation

2013-10-30 Thread Suzen, Mehmet
On 28 October 2013 14:26, Anindita Chattopadhyay
anindit...@mu-sigma.com wrote:
 We need to understand how we can implement this in Revo R.

Most people here contribute to the R community, not to Revo R. I
think it is unfair to ask this list to solve your Revo R issue.



Re: [R] Incorporate Julia into R

2013-10-17 Thread Suzen, Mehmet
On 17 October 2013 15:38, Timo Schmid timo_sch...@hotmail.com wrote:
 I have some code in R with a lot of matrix multiplication and inverting. R 
 can be very slow for larger matrices like 5000x5000.
 I have seen the new programming language Julia (www.julialang.org) which is 
 quite fast in doing matrix algebra.

It's not Julia, but LAPACK.
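R hands these operations to BLAS/LAPACK as well, so the speed depends largely on which BLAS build R is linked against and on avoiding explicit inverses. A small sketch with an invented, well-conditioned test matrix (200x200 here, to keep it quick):

```r
set.seed(1)
n <- 200
A <- crossprod(matrix(rnorm(n * n), n)) + diag(n)  # symmetric positive definite
b <- rnorm(n)

x1 <- solve(A, b)                # solves A x = b without forming the inverse
x2 <- chol2inv(chol(A)) %*% b    # explicit inverse via Cholesky, for comparison

max(abs(A %*% x1 - b))           # residual of the direct solve
```

Timing both variants with system.time() on the real 5000x5000 matrices would show whether the bottleneck is the algorithm or the BLAS library.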



Re: [R] RStudio with Revolution-R

2013-10-15 Thread Suzen, Mehmet
On 15 October 2013 01:27, Maxim Linchits mlinch...@gmail.com wrote:
 Hello,
 Is it possible to use Revolution-R's multithreading capability with
 RStudio as the IDE? Apparently, RevoR is available for Ubuntu,

Wrong list!

But for reference:
http://stackoverflow.com/questions/10835122/multithreading-with-r


