Hello,

From the HDB guide (http://hdb.docs.pivotal.io/212/hawq/reference/sql/CREATE-EXTERNAL-TABLE.html#topic1__section4), I read the following about web external tables:

*Note: ON ALL/HOST is deprecated when creating a readable external table, as HAWQ cannot guarantee scheduling executors on a specific host. Instead, use ON MASTER, ON <number>, or SEGMENT <virtual_segment> to specify which segment instances will execute the command.*

In my opinion, we should re-introduce the ON ALL option for external web tables if possible.

I am concerned about the ON <number> option in the external web table definition. We have to use the current number of hosts, so if we expand the cluster, we will have to change the external web table definition:

- If the value is smaller than the actual number of hosts, some rows will be missing.
- If the value is greater than the actual number of hosts, some rows will be duplicated.

If we add the ON ALL option:

- it will help to monitor the spill files
- it will help to read the segment log files (see the commented DDL hawq_toolkit._hawq_log_segment_ext in the file $GPHOME/share/postgresql)

I know that the ON HOST and ON ALL options were deprecated because of the elastic runtime in HAWQ 2.x, which is related to the Hadoop architecture. However, how can we execute a shell command exactly once on each host of the cluster via an external web table? In this case, we are not using the Hadoop FS, but the local FS.

Thanks,

*Cyrille LINTZ*
Advisory Solution Architect | Pivotal Europe South
Mobile: + 33 (0)6 11 48 71 10 | [email protected]
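
P.S. To make the ON <number> concern concrete, here is a minimal sketch of the kind of definition in question. The table name host_check_ext, the single column, the `hostname` command, and the host counts 3 and 4 are all hypothetical and chosen only for illustration. The sketch assumes the EXECUTE ... ON <number> form described in the note quoted above.

```sql
-- Hypothetical example: table name, column, and command are illustrative only.
-- "ON 3" assumes the cluster currently has 3 segment hosts.
CREATE EXTERNAL WEB TABLE host_check_ext (hostname text)
EXECUTE 'hostname' ON 3
FORMAT 'TEXT';

-- Assumed scenario: the cluster is expanded to 4 hosts.
-- The hard-coded number is now stale, so the table has to be
-- dropped and recreated by hand with the new value.
DROP EXTERNAL WEB TABLE host_check_ext;

CREATE EXTERNAL WEB TABLE host_check_ext (hostname text)
EXECUTE 'hostname' ON 4
FORMAT 'TEXT';
```

With an ON ALL option, the number would not need to be maintained in the DDL at all, which is the point of the request.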
