Yes, thanks. Are other files reading ok on Windows or is it just this particular file?
e.g. does this work :
fread("http://www.stats.ox.ac.uk/pub/datasets/csb/ch11b.dat")

[ I don't have Windows within easy reach. ]

On 06/03/14 12:43, carrieromichele wrote:
I quickly read the last mail. Is this the test you needed, guys?

> fread("http://www.cdc.gov/growthcharts/data/zscore/statage.csv", verbose=FALSE)
trying URL 'http://www.cdc.gov/growthcharts/data/zscore/statage.csv'
Content type 'application/octet-stream' length 66087 bytes (64 Kb)
opened URL
downloaded 64 Kb

Empty data.table (0 rows) of 14 cols: Sex,Agemos,L,M,S,P3...
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.9.3

loaded via a namespace (and not attached):
[1] plyr_1.8.1 Rcpp_0.11.0 reshape2_1.2.2 Rook_1.0-9 stringr_0.6.2 tools_3.0.2


On 6 March 2014 12:34, Matt Dowle <[email protected]> wrote:


    Works for me as well on linux,  same output as Kevin's.

    I was perplexed as to why Farrel's output has :

       File opened, filesize is 6.2E-05B
    but we see :

       File opened, filesize is 0.000 GB
    That line is switched depending on Windows or not. Comparing them :

    // On Windows :
    if (verbose) Rprintf("File opened, filesize is %.3 GB\n",
    1.0*filesize/(1024*1024*1024));

    // On non-Windows :
    if (verbose) Rprintf("File opened, filesize is %.3f GB\n",
    1.0*filesize/(1024*1024*1024));

    So, a missing "f". Just committed a fix for that (r1223). That
    line is part of a block that is necessarily different on Windows
    because its file and mmap commands are different.  The missing 'f'
    could have feasibly corrupted memory somehow (strange that the "G"
    of "GB" got overwritten) and if so would explain why it thought it
    got to the end of the file before seeing the \n after the \r.
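
    For illustration, here is a minimal standalone sketch of the
    corrected format string. It uses plain snprintf rather than
    Rprintf so it can be compiled outside R, and the helper name
    format_filesize_gb is hypothetical; it is not part of data.table.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not data.table code): writes the verbose message
 * into buf using the corrected "%.3f" conversion. The broken Windows
 * line had "%.3 GB", i.e. no 'f' after the precision. */
static int format_filesize_gb(char *buf, size_t buflen, long long filesize) {
    /* 1.0 * filesize promotes to double before dividing by 2^30. */
    return snprintf(buf, buflen, "File opened, filesize is %.3f GB\n",
                    1.0 * filesize / (1024 * 1024 * 1024));
}
```

    Called with the 66087 byte download from this thread, the helper
    produces the same "0.000 GB" line that the non-Windows logs show.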

    Farrel - does v1.9.2 work for you on Windows with verbose=FALSE?
    If yes, then very likely verbose=TRUE will now work with commit
    1223.  Best to start with a new R session to clear any possible
    memory corruption and then try :

    fread("http://www.cdc.gov/growthcharts/data/zscore/statage.csv",
          verbose=FALSE)

    If not, can anyone else reproduce on Windows? If so, I'll need to
    debug it on Windows.

    Thanks,
    Matt



    On 06/03/14 05:19, Kevin Ushey wrote:

        I think Matt and Arun will have more information -- IIUC, fread is
        only now gaining support for reading from URLs on Windows.

        Something strange: I get different output on the file structure
        with fread. Posting in case it's useful:

            statagecdc <- fread("http://www.cdc.gov/growthcharts/data/zscore/statage.csv", verbose=T)

        Input contains no \n. Taking this to be a filename to open
        File opened, filesize is 0.000 GB
        File is opened and mapped ok
        Detected eol as \r\n (CRLF) in that order, the Windows standard.
        Using line 30 to detect sep (the last non blank line in the first 'autostart') ... sep=','
        Found 14 columns
        First row with 14 fields occurs on line 1 (either column names or first row of data)
        All the fields on line 1 are character fields. Treating as the column names.
        Count of eol after first data row: 437
        Subtracted 1 for last eol and any trailing empty lines, leaving 436 data rows
        Type codes: 13333333333333 (first 5 rows)
        Type codes: 13333333333333 (+middle 5 rows)
        Type codes: 13333333333333 (+last 5 rows)
        Type codes: 13333333333333 (after applying colClasses and integer64)
        Type codes: 13333333333333 (after applying drop or select (if supplied)
        Allocating 14 column slots (14 - 0 NULL)
            0.000s ( 13%) Memory map (rerun may be quicker)
            0.000s (  4%) sep and header detection
            0.000s ( 13%) Count rows (wc -l)
            0.001s ( 49%) Column type detection (first, middle and last 5 rows)
            0.000s (  1%) Allocation of 436x14 result (xMB) in RAM
            0.000s ( 19%) Reading data
            0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
            0.000s (  0%) Coercing data already read in type bumps (if any)
            0.000s (  0%) Changing na.strings to NA
            0.002s        Total

        Note that fread sees \r\n as newlines for me.
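
        To make the difference between the two logs concrete, here is
        a rough sketch of how a first-line-ending check could behave.
        This is not fread's actual code: the names eol_kind and
        detect_eol are made up for illustration. It only shows how a
        scan that stops one byte too early, as hypothesised elsewhere
        in this thread, would report "\r only" for a CRLF file.

```c
#include <stddef.h>

/* Hypothetical classification of the first line ending in a buffer. */
typedef enum { EOL_NONE, EOL_LF, EOL_CR, EOL_CRLF } eol_kind;

/* Hypothetical detector (not fread's implementation): finds the first
 * '\r' or '\n' within the first len bytes and classifies it. */
static eol_kind detect_eol(const char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\n') return EOL_LF;
        if (buf[i] == '\r') {
            /* CRLF only if the following byte is inside the buffer
             * and is '\n'; otherwise report a bare CR. */
            if (i + 1 < len && buf[i + 1] == '\n') return EOL_CRLF;
            return EOL_CR;
        }
    }
    return EOL_NONE;
}
```

        With the full buffer "Sex,Agemos\r\n" this returns EOL_CRLF,
        but if the length passed in is one byte short, so that the
        buffer appears to end at the \r, it returns EOL_CR.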

            sessionInfo()

        R Under development (unstable) (2014-02-12 r64976)
        Platform: x86_64-apple-darwin13.0.0 (64-bit)

        locale:
        [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

        attached base packages:
        [1] stats     graphics  grDevices utils     datasets  methods   base

        other attached packages:
        [1] data.table_1.9.1     knitr_1.5.15         devtools_1.4.1.99    BiocInstaller_1.13.3

        loaded via a namespace (and not attached):
         [1] compiler_3.1.0    digest_0.6.4      evaluate_0.5.1    formatR_0.10      httr_0.2          memoise_0.1
         [7] parallel_3.1.0    plyr_1.8          Rcpp_0.11.0.3     RCurl_1.95-4.1    reshape2_1.3.0.99 stringr_0.6.2
        [13] tools_3.1.0       whisker_0.3-2

        Kevin

        On Wed, Mar 5, 2014 at 9:04 PM, Farrel Buchinsky
        <[email protected]> wrote:

                sessionInfo()

            R version 3.0.2 (2013-09-25)
            Platform: x86_64-w64-mingw32/x64 (64-bit)

            locale:
            [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252  LC_MONETARY=English_United States.1252
            [4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

            attached base packages:
            [1] grid      stats     graphics  grDevices utils     datasets  methods   base

            other attached packages:
            [1] reshape2_1.2.2    data.table_1.9.2  gridExtra_0.9.1   ggplot2_0.9.3.1   RGoogleDocs_0.7-0

            loaded via a namespace (and not attached):
             [1] colorspace_1.2-4   dichromat_2.0-0    digest_0.6.4       gtable_0.1.2       labeling_0.2       MASS_7.3-29        munsell_0.4.2
             [8] plyr_1.8.1         proto_0.3-10       RColorBrewer_1.0-5 Rcpp_0.11.0        RCurl_1.95-4.1     scales_0.2.3       stringr_0.6.2
            [15] tools_3.0.2        XML_3.98-1.1

            Farrel Buchinsky
            Google Voice Tel: (412) 567-7870


            On Wed, Mar 5, 2014 at 10:55 PM, Kevin Ushey
            <[email protected]> wrote:

                Works fine for me with data.table 1.9.1 on OS X. What
                is your sessionInfo()?

                Kevin

                On Wed, Mar 5, 2014 at 7:53 PM, Farrel Buchinsky
                <[email protected]> wrote:

                    Any idea why I am getting a data.table with
                    headers only and zero data? How can I get
                    around the problem?

                    fread("http://www.cdc.gov/growthcharts/data/zscore/statage.csv", verbose=T)
                    fails

                    read.csv("http://www.cdc.gov/growthcharts/data/zscore/statage.csv")
                    succeeds

                        statagecdc <- fread("http://www.cdc.gov/growthcharts/data/zscore/statage.csv", verbose=T)

                    trying URL 'http://www.cdc.gov/growthcharts/data/zscore/statage.csv'
                    Content type 'application/octet-stream' length 66087 bytes (64 Kb)
                    opened URL
                    downloaded 64 Kb

                    Input contains no \n. Taking this to be a filename to open
                    File opened, filesize is  6.2E-05B
                    File is opened and mapped ok
                    Detected eol as \r only (no \n afterwards). An old Mac 9 standard, discontinued in 2002 according to Wikipedia.
                    Using line 1 to detect sep (the last non blank line in the first 'autostart') ... sep=','
                    Found 14 columns
                    First row with 14 fields occurs on line 1 (either column names or first row of data)
                    All the fields on line 1 are character fields. Treating as the column names.
                    Byte after header row is eof or eol, 0 data rows present.
                    Type codes: 00000000000000 (first 5 rows)
                    Type codes: 00000000000000 (after applying colClasses and integer64)
                    Type codes: 00000000000000 (after applying drop or select (if supplied)
                    Allocating 14 column slots (14 - 0 NULL)
                        0.000s (  0%) Memory map (rerun may be quicker)
                        0.000s (  0%) sep and header detection
                        0.001s (100%) Count rows (wc -l)
                        0.000s (  0%) Column type detection (first, middle and last 5 rows)
                        0.000s (  0%) Allocation of 0x14 result (xMB) in RAM
                        0.000s (  0%) Reading data
                        0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
                        0.000s (  0%) Coercing data already read in type bumps (if any)
                        0.000s (  0%) Changing na.strings to NA
                        0.001s        Total


                    Thanks a lot.

                    Farrel Buchinsky
                    Google Voice Tel: (412) 567-7870

                    _______________________________________________
                    datatable-help mailing list
                    [email protected]
                    <mailto:[email protected]>

                    
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help







