Thanks Noa,


So is it safe to assume the path always starts with a slash, followed by the 
directory, another slash, and the unique file name?
Can you show me the code where the path is constructed? I couldn't find it, and 
I'd like to confirm the logic.


I used the following code to extract the data source, and it worked in my 
environment. I'm just not sure whether getDataSource() always returns the path 
in that format.

      // wds is the value returned by InputData.getDataSource(),
      // e.g. "/foo.main/1365_0"; the first token is the directory name.
      StringTokenizer st = new StringTokenizer(wds, "/", false);
      if (st.countTokens() == 0) {
          throw new RuntimeException("Invalid data source: " + wds);
      }
      return st.nextToken();
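
For comparison, here is an alternative sketch that only handles the write-side 
value, assuming it really is "<directory>/<unique_file_name>" as described 
below: Hadoop's Path class (already on the PXF classpath) can split off the 
per-segment file name without manual tokenizing. WritePathParser and 
directoryOf are made-up names.

      import org.apache.hadoop.fs.Path;

      public final class WritePathParser {

          // Hypothetical helper (not part of the PXF API): extract the directory
          // name from a write-side data source, assuming the value is always
          // "<directory>/<unique_file_name>", e.g. "/foo.main/1365_0".
          public static String directoryOf(String wds) {
              Path writePath = new Path(wds);        // "/foo.main/1365_0"
              Path parent = writePath.getParent();   // "/foo.main"
              if (parent == null) {
                  throw new IllegalArgumentException("Invalid data source: " + wds);
              }
              return parent.getName();               // "foo.main"
          }
      }

For the read-side value ("foo.main" with no slashes), the StringTokenizer 
version above already returns the right thing.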






At 2015-11-07 02:36:26, "Noa Horn" <[email protected]> wrote:

Hi,


1. Regarding the permissions issue - PXF runs as the pxf user, so any operation 
on Hadoop needs to be done on files or directories that the pxf user is allowed 
to read/write.

You mentioned changing the pxf user to be part of hdfs, but I am not sure that 
was necessary. The PXF RPM already adds the pxf user to the hadoop group.


2. Regarding writable tables: the way to use them is to define a directory 
where the data will be written. When the SQL executes, each segment writes its 
own data to the same directory, as defined in the external table, but in a 
separate file. That's why setDataSource() is needed when writing: each segment 
creates its own unique file name. The change you saw in the path is expected; 
it should be "<directory>/<unique_file_name>".
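
To illustrate the layout (a minimal sketch in plain java.nio rather than the 
PXF API; the file names are placeholders modeled on the "/foo.main/1365_0" 
value quoted below):

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;

    // Minimal sketch: every segment writes into the same directory but into its
    // own file, so concurrent INSERTs from the segments never collide.
    public class PerSegmentWriteDemo {
        public static void main(String[] args) throws IOException {
            Path directory = Files.createTempDirectory("foo.main"); // stands in for <directory>
            for (int segmentId = 0; segmentId < 3; segmentId++) {
                // stands in for "<directory>/<unique_file_name>", e.g. /foo.main/1365_0
                Path perSegmentFile = directory.resolve("1365_" + segmentId);
                Files.write(perSegmentFile,
                        ("data from segment " + segmentId + "\n").getBytes(StandardCharsets.UTF_8));
                System.out.println("wrote " + perSegmentFile);
            }
        }
    }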


Regards,

Noa





On Fri, Nov 6, 2015 at 12:11 AM, hawqstudy <[email protected]> wrote:



I tried setting the pxf user to hdfs in /etc/init.d/pxf-service and fixing the 
file owners for several directories.
Now I have a problem where getDataSource() returns something strange.
My DDL is:

pxf://localhost:51200/foo.main?PROFILE=XXXX

In the ReadAccessor, getDataSource() successfully returns foo.main as the data 
source name.
However, in the WriteAccessor, the InputData.getDataSource() call shows /foo.main/1365_0.
Tracing back through the code, I found that com.pivotal.pxf.service.rest.WritableResource.stream has:

    public Response stream(@Context final ServletContext servletContext,
                           @Context HttpHeaders headers,
                           @QueryParam("path") String path,
                           InputStream inputStream) throws Exception {

        /* Convert headers into a case-insensitive regular map */
        Map<String, String> params = convertToCaseInsensitiveMap(headers.getRequestHeaders());
        if (LOG.isDebugEnabled()) {
            LOG.debug("WritableResource started with parameters: " + params + " and write path: " + path);
        }

        ProtocolData protData = new ProtocolData(params);
        protData.setDataSource(path);

        SecuredHDFS.verifyToken(protData, servletContext);
        Bridge bridge = new WriteBridge(protData);

        // THREAD-SAFE parameter has precedence
        boolean isThreadSafe = protData.isThreadSafe() && bridge.isThreadSafe();
        LOG.debug("Request for " + path + " handled " +
                (isThreadSafe ? "without" : "with") + " synchronization");

        return isThreadSafe ?
                writeResponse(bridge, path, inputStream) :
                synchronizedWriteResponse(bridge, path, inputStream);
    }

The call protData.setDataSource(path) is what changes the data source from the 
expected value to the unexpected one.
So I kept looking for where the path comes from; jdb shows:
tomcat-http--18[1] print path
 path = "/foo.main/1365_0"
tomcat-http--18[1] where
  [1] com.pivotal.pxf.service.rest.WritableResource.stream (WritableResource.java:102)
  [2] sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
  [3] sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:57)
...

tomcat-http--18[1] print params

 params = "{accept=*/*, content-type=application/octet-stream, 
expect=100-continue, host=127.0.0.1:51200, transfer-encoding=chunked, 
X-GP-ACCESSOR=com.xxxx.pxf.plugins.xxxx.XXXXAccessor, x-gp-alignment=8, 
x-gp-attr-name0=id, x-gp-attr-name1=total, x-gp-attr-name2=comments, 
x-gp-attr-typecode0=23, x-gp-attr-typecode1=23, x-gp-attr-typecode2=1043, 
x-gp-attr-typename0=int4, x-gp-attr-typename1=int4, 
x-gp-attr-typename2=varchar, x-gp-attrs=3, x-gp-data-dir=foo.main, 
x-gp-format=GPDBWritable, 
X-GP-FRAGMENTER=com.xxxx.pxf.plugins.xxxx.XXXXFragmenter, x-gp-has-filter=0, 
x-gp-profile=XXXX, X-GP-RESOLVER=com.xxxx.pxf.plugins.xxxx.XXXXResolver, 
x-gp-segment-count=1, x-gp-segment-id=0, 
x-gp-uri=pxf://localhost:51200/foo.main?PROFILE=XXXX, x-gp-url-host=localhost, 
x-gp-url-port=51200, x-gp-xid=1365}"

So stream() is called from NativeMethodAccessorImpl.invoke0, which I couldn't 
follow any further. Does it make sense that "path" shows something unexpected? 
Should I get rid of protData.setDataSource(path) here? What is this code used 
for? Where does "path" come from? Is it constructed from X-GP-DATA-DIR, 
X-GP-XID and X-GP-SEGMENT-ID?


I'd expect to get "foo.main" instead of "/foo.main/1365_0" from 
InputData.getDataSource(), like what I get in the ReadAccessor.





At 2015-11-06 11:49:08, "hawqstudy" <[email protected]> wrote:

Hi Guys,


I've developed a PXF plugin and was able to make it read from our data source.
I also implemented a WriteResolver and a WriteAccessor; however, when I tried 
to insert into the table I got the following exception:



postgres=# CREATE EXTERNAL TABLE t3 (id int, total int, comments varchar)
LOCATION ('pxf://localhost:51200/foo.bar?PROFILE=XXXX')
FORMAT 'custom' (formatter='pxfwritable_import') ;
CREATE EXTERNAL TABLE
postgres=# select * from t3;
 id  | total | comments 
-----+-------+----------
 100 |   500 | 
 100 |  5000 | abcdfe
     |  5000 | 100
(3 rows)

postgres=# drop external table t3;
DROP EXTERNAL TABLE
postgres=# CREATE WRITABLE EXTERNAL TABLE t3 (id int, total int, comments varchar)
LOCATION ('pxf://localhost:51200/foo.bar?PROFILE=XXXX')
FORMAT 'custom' (formatter='pxfwritable_export') ;
CREATE EXTERNAL TABLE
postgres=# insert into t3 values ( 1, 2, 'hello');
ERROR:  remote component error (500) from '127.0.0.1:51200':  type  Exception report   message   org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Access denied for user pxf. Superuser privilege is required    description   The server encountered an internal error that prevented it from fulfilling this request.    exception   javax.servlet.ServletException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Access denied for user pxf. Superuser privilege is required (libchurl.c:852)  (seg6 localhost.localdomain:40000 pid=19701) (dispatcher.c:1681)

Nov 07, 2015 11:40:08 AM com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException


The log shows:

SEVERE: The exception contained within MappableContainerException could not be mapped to a response, re-throwing to the HTTP container
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Access denied for user pxf. Superuser privilege is required
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:122)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5906)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.datanodeReport(FSNamesystem.java:4941)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDatanodeReport(NameNodeRpcServer.java:1033)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDatanodeReport(ClientNamenodeProtocolServerSideTranslatorPB.java:698)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

at org.apache.hadoop.ipc.Client.call(Client.java:1476)
at org.apache.hadoop.ipc.Client.call(Client.java:1407)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy63.getDatanodeReport(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:626)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy64.getDatanodeReport(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.datanodeReport(DFSClient.java:2562)
at org.apache.hadoop.hdfs.DistributedFileSystem.getDataNodeStats(DistributedFileSystem.java:1196)
at com.pivotal.pxf.service.rest.ClusterNodesResource.read(ClusterNodesResource.java:62)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:957)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:423)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1079)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:620)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)

Since our data source is completely independent of HDFS, I'm not sure why PXF 
is still trying to access HDFS and requires superuser access.
Please let me know if there is anything missing here.
Cheers








 





 

