thisisnic opened a new issue, #38916:
URL: https://github.com/apache/arrow/issues/38916

   ### Describe the enhancement requested
   
   When we print a dataset, we get a short description of the dataset followed by the full schema, one column per line. This looks fine for datasets with few columns, but the output becomes unwieldy for wide datasets. An example from a dataset I've been working with recently (311 columns):
   
   ```
   > pums_person
   FileSystemDataset with 832 Parquet files
   SPORDER: int32
   RT: dictionary<values=string, indices=int32, ordered>
   SERIALNO: string
   PUMA: string
   ST: string
   ADJUST: int32
   PWGTP: int32
   AGEP: int32
   CIT: dictionary<values=string, indices=int32, ordered>
   COW: dictionary<values=string, indices=int32, ordered>
   DDRS: dictionary<values=string, indices=int32, ordered>
   DEYE: dictionary<values=string, indices=int32, ordered>
   DOUT: dictionary<values=string, indices=int32, ordered>
   DPHY: dictionary<values=string, indices=int32, ordered>
   DREM: dictionary<values=string, indices=int32, ordered>
   DWRK: dictionary<values=string, indices=int32, ordered>
   ENG: dictionary<values=string, indices=int32, ordered>
   FER: dictionary<values=string, indices=int32, ordered>
   GCL: dictionary<values=string, indices=int32, ordered>
   GCM: dictionary<values=string, indices=int32, ordered>
   GCR: dictionary<values=string, indices=int32, ordered>
   INTP: int32
   JWMNP: int32
   JWRIP: dictionary<values=string, indices=int32, ordered>
   JWTR: dictionary<values=string, indices=int32, ordered>
   LANX: dictionary<values=string, indices=int32, ordered>
   MAR: dictionary<values=string, indices=int32, ordered>
   MIG: dictionary<values=string, indices=int32, ordered>
   MIL: dictionary<values=string, indices=int32, ordered>
   MILY: dictionary<values=string, indices=int32, ordered>
   MLPA: dictionary<values=string, indices=int32, ordered>
   MLPB: dictionary<values=string, indices=int32, ordered>
   MLPC: dictionary<values=string, indices=int32, ordered>
   MLPD: dictionary<values=string, indices=int32, ordered>
   MLPE: dictionary<values=string, indices=int32, ordered>
   MLPF: dictionary<values=string, indices=int32, ordered>
   MLPG: dictionary<values=string, indices=int32, ordered>
   MLPH: dictionary<values=string, indices=int32, ordered>
   MLPI: dictionary<values=string, indices=int32, ordered>
   MLPJ: dictionary<values=string, indices=int32, ordered>
   MLPK: dictionary<values=string, indices=int32, ordered>
   NWAB: dictionary<values=string, indices=int32, ordered>
   NWAV: dictionary<values=string, indices=int32, ordered>
   NWLA: dictionary<values=string, indices=int32, ordered>
   NWLK: dictionary<values=string, indices=int32, ordered>
   NWRE: dictionary<values=string, indices=int32, ordered>
   OIP: int32
   PAP: int32
   REL: string
   RETP: int32
   SCH: dictionary<values=string, indices=int32, ordered>
   SCHG: string
   SCHL: string
   SEMP: int32
   SEX: dictionary<values=string, indices=int32, ordered>
   SSIP: int32
   SSP: int32
   WAGP: int32
   WKHP: int32
   WKL: dictionary<values=string, indices=int32, ordered>
   WKW: dictionary<values=string, indices=int32, ordered>
   YOEP: string
   UWRK: dictionary<values=string, indices=int32, ordered>
   ANC: dictionary<values=string, indices=int32, ordered>
   ANC1P: string
   ANC2P: string
   DECADE: dictionary<values=string, indices=int32, ordered>
   DRIVESP: dictionary<values=string, indices=int32, ordered>
   DS: dictionary<values=string, indices=int32, ordered>
   ESP: dictionary<values=string, indices=int32, ordered>
   ESR: dictionary<values=string, indices=int32, ordered>
   HISP: string
   INDP: string
   JWAP: string
   JWDP: string
   LANP: string
   MIGPUMA: string
   MIGSP: string
   MSP: dictionary<values=string, indices=int32, ordered>
   NAICSP: string
   NATIVITY: dictionary<values=string, indices=int32, ordered>
   OC: dictionary<values=string, indices=int32, ordered>
   OCCP: string
   PAOC: dictionary<values=string, indices=int32, ordered>
   PERNP: int32
   PINCP: int32
   POBP: string
   POVPIP: int32
   POWPUMA: string
   POWSP: string
   QTRBIR: dictionary<values=string, indices=int32, ordered>
   RAC1P: dictionary<values=string, indices=int32, ordered>
   RAC2P: string
   RAC3P: string
   RACAIAN: dictionary<values=string, indices=int32, ordered>
   RACASN: dictionary<values=string, indices=int32, ordered>
   RACBLK: dictionary<values=string, indices=int32, ordered>
   RACNHPI: dictionary<values=string, indices=int32, ordered>
   RACNUM: int32
   RACSOR: dictionary<values=string, indices=int32, ordered>
   RACWHT: dictionary<values=string, indices=int32, ordered>
   RC: dictionary<values=string, indices=int32, ordered>
   SFN: dictionary<values=string, indices=int32, ordered>
   SFR: dictionary<values=string, indices=int32, ordered>
   SOCP: string
   VPS: string
   WAOB: dictionary<values=string, indices=int32, ordered>
   FAGEP: dictionary<values=string, indices=int32, ordered>
   FANCP: dictionary<values=string, indices=int32, ordered>
   FCITP: dictionary<values=string, indices=int32, ordered>
   FCOWP: dictionary<values=string, indices=int32, ordered>
   FDDRSP: dictionary<values=string, indices=int32, ordered>
   FDEYEP: dictionary<values=string, indices=int32, ordered>
   FDOUTP: dictionary<values=string, indices=int32, ordered>
   FDPHYP: dictionary<values=string, indices=int32, ordered>
   FDREMP: dictionary<values=string, indices=int32, ordered>
   FDWRKP: dictionary<values=string, indices=int32, ordered>
   FENGP: dictionary<values=string, indices=int32, ordered>
   FESRP: dictionary<values=string, indices=int32, ordered>
   FFERP: dictionary<values=string, indices=int32, ordered>
   FGCLP: dictionary<values=string, indices=int32, ordered>
   FGCMP: dictionary<values=string, indices=int32, ordered>
   FGCRP: dictionary<values=string, indices=int32, ordered>
   FHISP: dictionary<values=string, indices=int32, ordered>
   FINDP: dictionary<values=string, indices=int32, ordered>
   FINTP: dictionary<values=string, indices=int32, ordered>
   FJWDP: dictionary<values=string, indices=int32, ordered>
   FJWMNP: dictionary<values=string, indices=int32, ordered>
   FJWRIP: dictionary<values=string, indices=int32, ordered>
   FJWTRP: dictionary<values=string, indices=int32, ordered>
   FLANP: dictionary<values=string, indices=int32, ordered>
   FLANXP: dictionary<values=string, indices=int32, ordered>
   FMARP: dictionary<values=string, indices=int32, ordered>
   FMIGP: dictionary<values=string, indices=int32, ordered>
   FMIGSP: dictionary<values=string, indices=int32, ordered>
   FMILPP: dictionary<values=string, indices=int32, ordered>
   FMILSP: dictionary<values=string, indices=int32, ordered>
   FMILYP: dictionary<values=string, indices=int32, ordered>
   FOCCP: dictionary<values=string, indices=int32, ordered>
   FOIP: dictionary<values=string, indices=int32, ordered>
   FPAP: dictionary<values=string, indices=int32, ordered>
   FPOBP: dictionary<values=string, indices=int32, ordered>
   FPOWSP: dictionary<values=string, indices=int32, ordered>
   FRACP: dictionary<values=string, indices=int32, ordered>
   FRELP: dictionary<values=string, indices=int32, ordered>
   FRETP: dictionary<values=string, indices=int32, ordered>
   FSCHGP: dictionary<values=string, indices=int32, ordered>
   FSCHLP: dictionary<values=string, indices=int32, ordered>
   FSCHP: dictionary<values=string, indices=int32, ordered>
   FSEMP: dictionary<values=string, indices=int32, ordered>
   FSEXP: dictionary<values=string, indices=int32, ordered>
   FSSIP: dictionary<values=string, indices=int32, ordered>
   FSSP: dictionary<values=string, indices=int32, ordered>
   FWAGP: dictionary<values=string, indices=int32, ordered>
   FWKHP: dictionary<values=string, indices=int32, ordered>
   FWKLP: dictionary<values=string, indices=int32, ordered>
   FWKWP: dictionary<values=string, indices=int32, ordered>
   FYOEP: dictionary<values=string, indices=int32, ordered>
   PWGTP1: int32
   PWGTP2: int32
   PWGTP3: int32
   PWGTP4: int32
   PWGTP5: int32
   PWGTP6: int32
   PWGTP7: int32
   PWGTP8: int32
   PWGTP9: int32
   PWGTP10: int32
   PWGTP11: int32
   PWGTP12: int32
   PWGTP13: int32
   PWGTP14: int32
   PWGTP15: int32
   PWGTP16: int32
   PWGTP17: int32
   PWGTP18: int32
   PWGTP19: int32
   PWGTP20: int32
   PWGTP21: int32
   PWGTP22: int32
   PWGTP23: int32
   PWGTP24: int32
   PWGTP25: int32
   PWGTP26: int32
   PWGTP27: int32
   PWGTP28: int32
   PWGTP29: int32
   PWGTP30: int32
   PWGTP31: int32
   PWGTP32: int32
   PWGTP33: int32
   PWGTP34: int32
   PWGTP35: int32
   PWGTP36: int32
   PWGTP37: int32
   PWGTP38: int32
   PWGTP39: int32
   PWGTP40: int32
   PWGTP41: int32
   PWGTP42: int32
   PWGTP43: int32
   PWGTP44: int32
   PWGTP45: int32
   PWGTP46: int32
   PWGTP47: int32
   PWGTP48: int32
   PWGTP49: int32
   PWGTP50: int32
   PWGTP51: int32
   PWGTP52: int32
   PWGTP53: int32
   PWGTP54: int32
   PWGTP55: int32
   PWGTP56: int32
   PWGTP57: int32
   PWGTP58: int32
   PWGTP59: int32
   PWGTP60: int32
   PWGTP61: int32
   PWGTP62: int32
   PWGTP63: int32
   PWGTP64: int32
   PWGTP65: int32
   PWGTP66: int32
   PWGTP67: int32
   PWGTP68: int32
   PWGTP69: int32
   PWGTP70: int32
   PWGTP71: int32
   PWGTP72: int32
   PWGTP73: int32
   PWGTP74: int32
   PWGTP75: int32
   PWGTP76: int32
   PWGTP77: int32
   PWGTP78: int32
   PWGTP79: int32
   PWGTP80: int32
   NOP: dictionary<values=string, indices=int32, ordered>
   ADJINC: double
   CITWP: string
   DEAR: dictionary<values=string, indices=int32, ordered>
   DRAT: dictionary<values=string, indices=int32, ordered>
   DRATX: dictionary<values=string, indices=int32, ordered>
   HINS1: dictionary<values=string, indices=int32, ordered>
   HINS2: dictionary<values=string, indices=int32, ordered>
   HINS3: dictionary<values=string, indices=int32, ordered>
   HINS4: dictionary<values=string, indices=int32, ordered>
   HINS5: dictionary<values=string, indices=int32, ordered>
   HINS6: dictionary<values=string, indices=int32, ordered>
   HINS7: dictionary<values=string, indices=int32, ordered>
   MARHD: dictionary<values=string, indices=int32, ordered>
   MARHM: dictionary<values=string, indices=int32, ordered>
   MARHT: dictionary<values=string, indices=int32, ordered>
   MARHW: dictionary<values=string, indices=int32, ordered>
   MARHYP: string
   DIS: dictionary<values=string, indices=int32, ordered>
   HICOV: dictionary<values=string, indices=int32, ordered>
   PRIVCOV: dictionary<values=string, indices=int32, ordered>
   PUBCOV: dictionary<values=string, indices=int32, ordered>
   FCITWP: dictionary<values=string, indices=int32, ordered>
   FDEARP: dictionary<values=string, indices=int32, ordered>
   FDRATP: dictionary<values=string, indices=int32, ordered>
   FDRATXP: dictionary<values=string, indices=int32, ordered>
   FHINS1P: dictionary<values=string, indices=int32, ordered>
   FHINS2P: dictionary<values=string, indices=int32, ordered>
   FHINS3P: dictionary<values=string, indices=int32, ordered>
   FHINS4P: dictionary<values=string, indices=int32, ordered>
   FHINS5P: dictionary<values=string, indices=int32, ordered>
   FHINS6P: dictionary<values=string, indices=int32, ordered>
   FHINS7P: dictionary<values=string, indices=int32, ordered>
   FMARHDP: dictionary<values=string, indices=int32, ordered>
   FMARHMP: dictionary<values=string, indices=int32, ordered>
   FMARHTP: dictionary<values=string, indices=int32, ordered>
   FMARHWP: dictionary<values=string, indices=int32, ordered>
   FMARHYP: dictionary<values=string, indices=int32, ordered>
   WRK: dictionary<values=string, indices=int32, ordered>
   FOD1P: string
   FOD2P: string
   SCIENGP: dictionary<values=string, indices=int32, ordered>
   SCIENGRLP: dictionary<values=string, indices=int32, ordered>
   FFODP: dictionary<values=string, indices=int32, ordered>
   FHINS3C: dictionary<values=string, indices=int32, ordered>
   FHINS4C: dictionary<values=string, indices=int32, ordered>
   FHINS5C: dictionary<values=string, indices=int32, ordered>
   RELP: string
   FWRKP: dictionary<values=string, indices=int32, ordered>
   FDISP: dictionary<values=string, indices=int32, ordered>
   FPERNP: dictionary<values=string, indices=int32, ordered>
   FPINCP: dictionary<values=string, indices=int32, ordered>
   FPRIVCOVP: dictionary<values=string, indices=int32, ordered>
   FPUBCOVP: dictionary<values=string, indices=int32, ordered>
   RACNH: dictionary<values=string, indices=int32, ordered>
   RACPI: dictionary<values=string, indices=int32, ordered>
   SSPA: dictionary<values=string, indices=int32, ordered>
   MLPCD: dictionary<values=string, indices=int32, ordered>
   MLPFG: dictionary<values=string, indices=int32, ordered>
   FHICOVP: dictionary<values=string, indices=int32, ordered>
   DIVISION: dictionary<values=string, indices=int32, ordered>
   REGION: dictionary<values=string, indices=int32, ordered>
   HIMRKS: dictionary<values=string, indices=int32, ordered>
   JWTRNS: dictionary<values=string, indices=int32, ordered>
   RELSHIPP: string
   WKWN: int32
   FHIMRKSP: dictionary<values=string, indices=int32, ordered>
   FJWTRNSP: dictionary<values=string, indices=int32, ordered>
   FRELSHIPP: dictionary<values=string, indices=int32, ordered>
   FWKWNP: dictionary<values=string, indices=int32, ordered>
   MLPIK: dictionary<values=string, indices=int32, ordered>
   year: int32
   location: string
   ```
   
   We could print a truncated preview, similar to tibble's, followed by a hint to call `schema()` to view the full schema. For contrast, the tibble preview of the same dataset:
   
   ```
   > head(pums_person, 0) %>% collect()
   # A tibble: 0 × 311
   # ℹ 311 variables: SPORDER <int>, RT <ord>, SERIALNO <chr>, PUMA <chr>, ST <chr>, ADJUST <int>, PWGTP <int>, AGEP <int>,
   #   CIT <ord>, COW <ord>, DDRS <ord>, DEYE <ord>, DOUT <ord>, DPHY <ord>, DREM <ord>, DWRK <ord>, ENG <ord>, FER <ord>,
   #   GCL <ord>, GCM <ord>, GCR <ord>, INTP <int>, JWMNP <int>, JWRIP <ord>, JWTR <ord>, LANX <ord>, MAR <ord>, MIG <ord>,
   #   MIL <ord>, MILY <ord>, MLPA <ord>, MLPB <ord>, MLPC <ord>, MLPD <ord>, MLPE <ord>, MLPF <ord>, MLPG <ord>, MLPH <ord>,
   #   MLPI <ord>, MLPJ <ord>, MLPK <ord>, NWAB <ord>, NWAV <ord>, NWLA <ord>, NWLK <ord>, NWRE <ord>, OIP <int>, PAP <int>,
   #   REL <chr>, RETP <int>, SCH <ord>, SCHG <chr>, SCHL <chr>, SEMP <int>, SEX <ord>, SSIP <int>, SSP <int>, WAGP <int>,
   #   WKHP <int>, WKL <ord>, WKW <ord>, YOEP <chr>, UWRK <ord>, ANC <ord>, ANC1P <chr>, ANC2P <chr>, DECADE <ord>, …
   # ℹ Use `colnames()` to see all variable names
   ```
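
To make the idea concrete, here is a rough sketch of how a truncated schema print might work. It is only an illustration, not how arrow currently prints datasets: the helper name `format_schema_preview()`, the `max_fields` argument, its default of 20, and the footer wording are all hypothetical. To keep the example self-contained it takes plain character vectors of field names and type strings rather than a real schema object.

```r
# Hypothetical helper (not part of arrow): format at most `max_fields`
# "name: type" lines, then a short footer when fields were omitted.
# `field_names` and `field_types` are character vectors of equal length.
format_schema_preview <- function(field_names, field_types, max_fields = 20L) {
  stopifnot(length(field_names) == length(field_types))
  n <- length(field_names)
  shown <- seq_len(min(n, max_fields))

  # sprintf() returns character(0) for zero-length input, so an empty
  # schema yields no lines at all.
  lines <- sprintf("%s: %s", field_names[shown], field_types[shown])

  if (n > max_fields) {
    lines <- c(
      lines,
      "...",
      sprintf("%d more columns", n - max_fields),
      "Use `schema()` to see entire schema"  # assumed footer wording
    )
  }
  lines
}

# Made-up input resembling the replicate-weight columns in the example
preview <- format_schema_preview(
  field_names = paste0("PWGTP", 1:80),
  field_types = rep("int32", 80),
  max_fields = 3L
)
cat(preview, sep = "\n")
```

With a cutoff like this, narrow datasets would print exactly as they do today, and only wide ones would get the shortened listing plus the hint.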
   
   
   ### Component(s)
   
   R

