Evan Chan created SPARK-3297:
--------------------------------

             Summary: [Spark SQL][UI] SchemaRDD toString with many columns 
messes up Storage tab display
                 Key: SPARK-3297
                 URL: https://issues.apache.org/jira/browse/SPARK-3297
             Project: Spark
          Issue Type: Bug
          Components: SQL, Web UI
    Affects Versions: 1.0.2
            Reporter: Evan Chan
            Priority: Minor


When a SchemaRDD with many columns (57 in this example) is cached using 
sqlContext.cacheTable, the Storage tab of the driver Web UI renders badly: the 
long string describing the SchemaRDD makes the first column much wider than the 
others, and in fact much wider than the browser window.  It would be nice to 
restrict the first column to, say, 50% of the width of the browser window, with 
some minimum.

For example, this is the SchemaRDD text for my table:

        RDD Storage Info for ExistingRdd 
[ActionGeo_ADM1Code#198,ActionGeo_CountryCode#199,ActionGeo_FeatureID#200,ActionGeo_FullName#201,ActionGeo_Lat#202,ActionGeo_Long#203,ActionGeo_Type#204,Actor1Code#205,Actor1CountryCode#206,Actor1EthnicCode#207,Actor1Geo_ADM1Code#208,Actor1Geo_CountryCode#209,Actor1Geo_FeatureID#210,Actor1Geo_FullName#211,Actor1Geo_Lat#212,Actor1Geo_Long#213,Actor1Geo_Type#214,Actor1KnownGroupCode#215,Actor1Name#216,Actor1Religion1Code#217,Actor1Religion2Code#218,Actor1Type1Code#219,Actor1Type2Code#220,Actor1Type3Code#221,Actor2Code#222,Actor2CountryCode#223,Actor2EthnicCode#224,Actor2Geo_ADM1Code#225,Actor2Geo_CountryCode#226,Actor2Geo_FeatureID#227,Actor2Geo_FullName#228,Actor2Geo_Lat#229,Actor2Geo_Long#230,Actor2Geo_Type#231,Actor2KnownGroupCode#232,Actor2Name#233,Actor2Religion1Code#234,Actor2Religion2Code#235,Actor2Type1Code#236,Actor2Type2Code#237,Actor2Type3Code#238,AvgTone#239,DATEADDED#240,Day#241,EventBaseCode#242,EventCode#243,EventId#244,EventRootCode#245,FractionDate#246,GoldsteinScale#247,IsRootEvent#248,MonthYear#249,NumArticles#250,NumMentions#251,NumSources#252,QuadClass#253,Year#254],
 MappedRDD[200]

I would personally love to fix the toString method so that it does not 
necessarily print every column, but cuts the list off after a certain number.  
This would also help the printout in the Spark Shell.  For example:

[ActionGeo_ADM1Code#198,ActionGeo_CountryCode#199,ActionGeo_FeatureID#200,ActionGeo_FullName#201,ActionGeo_Lat#202
 .... and 52 more columns]
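
The truncation proposed above could look roughly like the following. This is 
only a language-neutral sketch of the logic, written in Python for brevity; it 
is not Spark code. The function name truncated_schema_string and the max_shown 
parameter are hypothetical, and the output is kept on one line rather than 
wrapped as in the example above.

```python
def truncated_schema_string(columns, max_shown=5):
    """Render a column list, eliding everything past the first max_shown names.

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    if max_shown < 0:
        raise ValueError("max_shown must be non-negative")

    shown = columns[:max_shown]
    hidden = len(columns) - len(shown)

    body = ",".join(shown)
    if hidden > 0:
        # Summarise the elided columns instead of printing each one.
        noun = "column" if hidden == 1 else "columns"
        body += " .... and {} more {}".format(hidden, noun)
    return "[" + body + "]"


if __name__ == "__main__":
    # 57 placeholder column names, mirroring the column count in the report.
    cols = ["col{}#{}".format(i, 198 + i) for i in range(57)]
    print(truncated_schema_string(cols))
```

With 57 columns and max_shown=5, the summary reports 52 hidden columns, which 
matches the count in the example above. A short list that fits within 
max_shown is printed in full, so existing output for narrow tables would not 
change.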



