alamb edited a comment on issue #1441:
URL: 
https://github.com/apache/arrow-datafusion/issues/1441#issuecomment-1000429890


   Cross-referencing the output from pandas and the parquet reader, 
   here are the rows from pandas and arrow/parquet that have non-null values 
for `stop_name`:
   
   Pandas:
   ```
   1523,2021-11-15 02:40:32,54827807,102,PUSTELNIK
   2475,2021-11-15 04:54:14,54807500,102,PKP Olszynka Grochowska
   6218,2021-11-15 10:25:27,54802989,104,Żerań FSO
   7286,2021-11-15 04:29:31,54787914,140,METRO RATUSZ-ARSENAŁ
   7431,2021-11-15 08:23:38,54793831,157,Rokosowska
   7433,2021-11-15 10:08:11,54793833,157,Sienna
   7438,2021-11-15 11:56:45,54793835,157,Wola-Ratusz
   7447,2021-11-15 21:15:54,54793844,157,Miła
   7479,2021-11-15 12:06:00,54793886,157,Hala Kopińska
   7692,2021-11-15 10:56:08,54793834,157,Mennica
   7693,2021-11-15 11:52:22,54793835,157,Smocza
   7694,2021-11-15 11:58:07,54793835,157,Wola-Ratusz
   7696,2021-11-15 14:42:33,54793838,157,Wawelska
   7702,2021-11-15 20:24:26,54793843,157,Muranowska
   7819,2021-11-15 04:59:07,54793828,157,pl.Starynkiewicza
   7824,2021-11-15 08:08:57,54793831,157,Chłodna
   7827,2021-11-15 10:11:37,54793833,157,pl.Zawiszy
   7828,2021-11-15 10:49:38,54793834,157,pl.Zawiszy
   7829,2021-11-15 10:53:04,54793834,157,Sienna
   7830,2021-11-15 12:16:45,54793835,157,Dobosza
   
   ```
   
   And arrow reports the following. Interestingly, `PUSTELNIK` is repeated 
several times, and after that the sequence of values is very similar to the 
pandas output but offset:
   ```
   1523,2021-11-15 02:40:32,54827807,102,PUSTELNIK
   2475,2021-11-15 04:54:14,54807500,102,PKP Olszynka Grochowska
   6218,2021-11-15 10:25:27,54802989,104,PUSTELNIK
   7286,2021-11-15 04:29:31,54787914,140,PUSTELNIK
   7431,2021-11-15 08:23:38,54793831,157,PUSTELNIK
   7433,2021-11-15 10:08:11,54793833,157,PUSTELNIK
   7438,2021-11-15 11:56:45,54793835,157,PUSTELNIK
   7447,2021-11-15 21:15:54,54793844,157,PUSTELNIK
   7479,2021-11-15 12:06:00,54793886,157,Żerań FSO
   7692,2021-11-15 10:56:08,54793834,157,Rokosowska
   7693,2021-11-15 11:52:22,54793835,157,Sienna
   7694,2021-11-15 11:58:07,54793835,157,Wola-Ratusz
   7696,2021-11-15 14:42:33,54793838,157,Miła
   7702,2021-11-15 20:24:26,54793843,157,Hala Kopińska
   7819,2021-11-15 04:59:07,54793828,157,Chłodna
   7824,2021-11-15 08:08:57,54793831,157,Mennica
   7826,2021-11-15 08:36:09,54793832,157,Smocza
   7827,2021-11-15 10:11:37,54793833,157,Wola-Ratusz
   7828,2021-11-15 10:49:38,54793834,157,Wawelska
   7829,2021-11-15 10:53:04,54793834,157,Muranowska
   7830,2021-11-15 12:16:45,54793835,157,pl.Starynkiewicza
   ```
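
A comparison like the one above could be automated along these lines. This is only a rough sketch, not what was actually run: it assumes each dump is comma-separated with the row index first and `stop_name` last, and the helper names `parse` and `diff_stop_names` are made up for illustration. Only the first four rows of each listing are included.

```python
# Sketch: diff the stop_name column of two text dumps, keyed by row index.
# Assumed layout (not confirmed by the issue): index,timestamp,id,line,stop_name

PANDAS_ROWS = """\
1523,2021-11-15 02:40:32,54827807,102,PUSTELNIK
2475,2021-11-15 04:54:14,54807500,102,PKP Olszynka Grochowska
6218,2021-11-15 10:25:27,54802989,104,Żerań FSO
7286,2021-11-15 04:29:31,54787914,140,METRO RATUSZ-ARSENAŁ
"""

ARROW_ROWS = """\
1523,2021-11-15 02:40:32,54827807,102,PUSTELNIK
2475,2021-11-15 04:54:14,54807500,102,PKP Olszynka Grochowska
6218,2021-11-15 10:25:27,54802989,104,PUSTELNIK
7286,2021-11-15 04:29:31,54787914,140,PUSTELNIK
"""


def parse(text):
    """Map row index -> stop_name for one dump (hypothetical helper)."""
    rows = {}
    for line in text.strip().splitlines():
        # Split into at most 5 fields so the last field keeps any commas.
        idx, _timestamp, _ident, _line_no, stop_name = line.strip().split(",", 4)
        rows[int(idx)] = stop_name
    return rows


def diff_stop_names(left, right):
    """Return (index, left_value, right_value) for every row that differs.

    A row present on only one side is reported with None for the other side.
    """
    mismatches = []
    for idx in sorted(left.keys() | right.keys()):
        if left.get(idx) != right.get(idx):
            mismatches.append((idx, left.get(idx), right.get(idx)))
    return mismatches


if __name__ == "__main__":
    for idx, pandas_value, arrow_value in diff_stop_names(
        parse(PANDAS_ROWS), parse(ARROW_ROWS)
    ):
        print(f"{idx}: pandas={pandas_value!r} arrow={arrow_value!r}")
```

Keying on the row index rather than on position means a row that appears in only one dump shows up as a mismatch against `None` instead of shifting every later comparison.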
   

