The file name is constructed by each HAWQ segment, each one with its own unique id. Check out build_file_name_for_write() in src/bin/gpfusion/gpbridgeapi.c.
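Judging from the values quoted later in this thread (write path "/foo.main/1365_0", with x-gp-data-dir=foo.main, x-gp-xid=1365 and x-gp-segment-id=0), the path the segment passes to PXF appears to have the shape "/<data-dir>/<xid>_<segment-id>". A minimal sketch of that shape, assuming it holds; the class and method names here are illustrative only and not part of HAWQ or PXF:

    // Illustrative only: mirrors the "/<data-dir>/<xid>_<segment-id>" shape observed
    // in this thread; the real construction happens on the HAWQ segment side in
    // build_file_name_for_write() (src/bin/gpfusion/gpbridgeapi.c).
    public final class WritePathShape {
        static String buildWritePath(String dataDir, String xid, int segmentId) {
            // dataDir ~ X-GP-DATA-DIR, xid ~ X-GP-XID, segmentId ~ X-GP-SEGMENT-ID
            return "/" + dataDir + "/" + xid + "_" + segmentId;
        }

        public static void main(String[] args) {
            // Reproduces the path reported below: /foo.main/1365_0
            System.out.println(buildWritePath("foo.main", "1365", 0));
        }
    }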
On Fri, Nov 6, 2015 at 5:03 PM, hawqstudy <[email protected]> wrote:
>
> Thanks Noa,
>
> So is it safe to assume the path always starts with a slash, followed by
> the data source, another slash, and other stuff?
> Can you show me the code where it constructs the path? I couldn't find it,
> and I'd like to confirm the logic.
>
> I used the following code to extract the data source, and it worked in my
> environment. I'm just not sure whether getDataSource() always returns that
> format.
>
> StringTokenizer st = new StringTokenizer(wds, "/", false);
> if (st.countTokens() == 0) {
>     throw new RuntimeException("Invalid data source: " + wds);
> }
> return st.nextToken();
>
>
> At 2015-11-07 02:36:26, "Noa Horn" <[email protected]> wrote:
>
> Hi,
>
> 1. Regarding the permissions issue - PXF runs as the pxf user, so any
> operation on Hadoop needs to be done on files or directories that allow
> the pxf user to read/write.
> You mentioned changing the pxf user to be part of hdfs, but I am not sure
> that was necessary. The PXF RPM already adds the pxf user to the hadoop
> group.
>
> 2. Regarding writable tables: the way to use them is to define a
> *directory* where the data will be written. When the SQL executes, each
> segment writes its own data to the same directory, as defined in the
> external table, but in a separate file. That's why setDataSource() is
> needed when writing - each segment creates its own unique file name. The
> change you saw in the path is expected; it should be
> "<directory>/<unique_file_name>".
>
> Regards,
> Noa
>
>
> On Fri, Nov 6, 2015 at 12:11 AM, hawqstudy <[email protected]> wrote:
>
>> Tried to set the pxf user to hdfs in /etc/init.d/pxf-service and fix file
>> owners for several dirs.
>> Now I have a problem: getDataSource() returns something strange.
>> My DDL is:
>>
>> pxf://localhost:51200/foo.main?PROFILE=XXXX
>>
>> In the Read Accessor, getDataSource() correctly returns foo.main as the
>> data source name.
>> However, in the Write Accessor, the InputData.getDataSource() call
>> shows /foo.main/1365_0.
>> By tracing back the code I found that
>> com.pivotal.pxf.service.rest.WritableResource.stream() has:
>>
>> public Response stream(@Context final ServletContext servletContext,
>>                        @Context HttpHeaders headers,
>>                        @QueryParam("path") String path,
>>                        InputStream inputStream) throws Exception {
>>
>>     /* Convert headers into a case-insensitive regular map */
>>     Map<String, String> params =
>>             convertToCaseInsensitiveMap(headers.getRequestHeaders());
>>
>>     if (LOG.isDebugEnabled()) {
>>         LOG.debug("WritableResource started with parameters: " + params
>>                 + " and write path: " + path);
>>     }
>>
>>     ProtocolData protData = new ProtocolData(params);
>>     protData.setDataSource(path);
>>
>>     SecuredHDFS.verifyToken(protData, servletContext);
>>     Bridge bridge = new WriteBridge(protData);
>>
>>     // THREAD-SAFE parameter has precedence
>>     boolean isThreadSafe = protData.isThreadSafe() && bridge.isThreadSafe();
>>     LOG.debug("Request for " + path + " handled "
>>             + (isThreadSafe ? "without" : "with") + " synchronization");
>>
>>     return isThreadSafe ?
>>             writeResponse(bridge, path, inputStream) :
>>             synchronizedWriteResponse(bridge, path, inputStream);
>> }
>>
>> The call protData.setDataSource(path); sets the data source from the
>> expected value to the strange one.
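For the extraction question above, a slightly more defensive variant of the StringTokenizer snippet can be sketched. It assumes only what is described in the reply: for writable tables the data source is "<directory>/<unique_file_name>" with a leading slash (e.g. "/foo.main/1365_0"), while for readable tables it is just the directory name. The class and method names are hypothetical, not part of the PXF API:

    // Illustration only: returns the table directory from the value of
    // InputData.getDataSource(), e.g. "/foo.main/1365_0" -> "foo.main".
    public final class DataSourceDir {
        static String extractDirectory(String wds) {
            if (wds == null || wds.isEmpty()) {
                throw new RuntimeException("Invalid data source: " + wds);
            }
            // Drop the leading slash, if present, then keep everything before the
            // next slash (the per-segment file name such as "1365_0").
            String trimmed = wds.startsWith("/") ? wds.substring(1) : wds;
            if (trimmed.isEmpty()) {
                throw new RuntimeException("Invalid data source: " + wds);
            }
            int slash = trimmed.indexOf('/');
            return slash < 0 ? trimmed : trimmed.substring(0, slash);
        }

        public static void main(String[] args) {
            System.out.println(extractDirectory("/foo.main/1365_0")); // foo.main
            System.out.println(extractDirectory("foo.main"));         // foo.main
        }
    }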
>>
>> So I kept looking for where the path comes from; jdb shows:
>>
>> tomcat-http--18[1] print path
>>  path = "/foo.main/1365_0"
>> tomcat-http--18[1] where
>>   [1] com.pivotal.pxf.service.rest.WritableResource.stream (WritableResource.java:102)
>>   [2] sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>>   [3] sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:57)
>>   ...
>>
>> tomcat-http--18[1] print params
>>  params = "{accept=*/*, content-type=application/octet-stream,
>> expect=100-continue, host=127.0.0.1:51200, transfer-encoding=chunked,
>> X-GP-ACCESSOR=com.xxxx.pxf.plugins.xxxx.XXXXAccessor, x-gp-alignment=8,
>> x-gp-attr-name0=id, x-gp-attr-name1=total, x-gp-attr-name2=comments,
>> x-gp-attr-typecode0=23, x-gp-attr-typecode1=23, x-gp-attr-typecode2=1043,
>> x-gp-attr-typename0=int4, x-gp-attr-typename1=int4,
>> x-gp-attr-typename2=varchar, x-gp-attrs=3, x-gp-data-dir=foo.main,
>> x-gp-format=GPDBWritable,
>> X-GP-FRAGMENTER=com.xxxx.pxf.plugins.xxxx.XXXXFragmenter,
>> x-gp-has-filter=0, x-gp-profile=XXXX,
>> X-GP-RESOLVER=com.xxxx.pxf.plugins.xxxx.XXXXResolver, x-gp-segment-count=1,
>> x-gp-segment-id=0, x-gp-uri=pxf://localhost:51200/foo.main?PROFILE=XXXX,
>> x-gp-url-host=localhost, x-gp-url-port=51200, x-gp-xid=1365}"
>>
>> So stream() is called from NativeMethodAccessorImpl.invoke0, which I
>> couldn't follow any further. Does it make sense that "path" shows
>> something strange? Should I get rid of protData.setDataSource(path) here?
>> What is this code used for? Where does the "path" come from? Is it
>> constructed from X-GP-DATA-DIR, X-GP-XID and X-GP-SEGMENT-ID?
>>
>> I'd expect to get "foo.main" instead of "/foo.main/1365_0" from
>> InputData.getDataSource(), like what I get in the ReadAccessor.
>>
>>
>> At 2015-11-06 11:49:08, "hawqstudy" <[email protected]> wrote:
>>
>> Hi Guys,
>>
>> I've developed a PXF plugin and was able to make it read from our data
>> source. I also implemented WriteResolver and WriteAccessor, but when I
>> tried to insert into the table I got the following exception:
>>
>> postgres=# CREATE EXTERNAL TABLE t3 (id int, total int, comments varchar)
>> LOCATION ('pxf://localhost:51200/foo.bar?PROFILE=XXXX')
>> FORMAT 'custom' (formatter='pxfwritable_import') ;
>> CREATE EXTERNAL TABLE
>> postgres=# select * from t3;
>>  id  | total | comments
>> -----+-------+----------
>>  100 |   500 |
>>  100 |  5000 | abcdfe
>>      |  5000 | 100
>> (3 rows)
>>
>> postgres=# drop external table t3;
>> DROP EXTERNAL TABLE
>> postgres=# CREATE WRITABLE EXTERNAL TABLE t3 (id int, total int, comments varchar)
>> LOCATION ('pxf://localhost:51200/foo.bar?PROFILE=XXXX')
>> FORMAT 'custom' (formatter='pxfwritable_export') ;
>> CREATE EXTERNAL TABLE
>> postgres=# insert into t3 values ( 1, 2, 'hello');
>> ERROR: remote component error (500) from '127.0.0.1:51200': type Exception
>> report message
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
>> Access denied for user pxf. Superuser privilege is required description
>> The server encountered an internal error that prevented it from
>> fulfilling this request. exception javax.servlet.ServletException:
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
>> Access denied for user pxf.
>> Superuser privilege is required  (libchurl.c:852)  (seg6
>> localhost.localdomain:40000 pid=19701) (dispatcher.c:1681)
>>
>> The log shows:
>>
>> Nov 07, 2015 11:40:08 AM com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException
>> SEVERE: The exception contained within MappableContainerException could
>> not be mapped to a response, re-throwing to the HTTP container
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException):
>> Access denied for user pxf. Superuser privilege is required
>>     at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkSuperuserPrivilege(FSPermissionChecker.java:122)
>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkSuperuserPrivilege(FSNamesystem.java:5906)
>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.datanodeReport(FSNamesystem.java:4941)
>>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDatanodeReport(NameNodeRpcServer.java:1033)
>>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDatanodeReport(ClientNamenodeProtocolServerSideTranslatorPB.java:698)
>>     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>>
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1476)
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>>     at com.sun.proxy.$Proxy63.getDatanodeReport(Unknown Source)
>>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDatanodeReport(ClientNamenodeProtocolTranslatorPB.java:626)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>     at com.sun.proxy.$Proxy64.getDatanodeReport(Unknown Source)
>>     at org.apache.hadoop.hdfs.DFSClient.datanodeReport(DFSClient.java:2562)
>>     at org.apache.hadoop.hdfs.DistributedFileSystem.getDataNodeStats(DistributedFileSystem.java:1196)
>>     at com.pivotal.pxf.service.rest.ClusterNodesResource.read(ClusterNodesResource.java:62)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
>>     at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
>>     at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
>>     at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
>>     at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>>     at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>>     at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>>     at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>>     at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
>>     at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
>>     at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
>>     at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
>>     at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
>>     at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
>>     at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
>>     at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
>>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
>>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>>     at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
>>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
>>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
>>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
>>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
>>     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:505)
>>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
>>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
>>     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:957)
>>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
>>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:423)
>>     at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1079)
>>     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:620)
>>     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Since our data source is totally independent from HDFS, I'm not sure why
>> it's still trying to access HDFS and requires superuser access.
>> Please let me know if there is anything missing here.
>>
>> Cheers
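For reference, the stack trace above shows this particular "Superuser privilege is required" exception is raised while com.pivotal.pxf.service.rest.ClusterNodesResource.read() calls DistributedFileSystem.getDataNodeStats(), i.e. a datanode report against the NameNode, which HDFS only permits for a superuser. A minimal sketch that reproduces just that call, assuming a Hadoop client on the classpath and fs.defaultFS pointing at the cluster (the class name is illustrative, not part of PXF):

    // Illustrative reproduction of the failing call from the trace above:
    // getDataNodeStats() issues a datanode report, which the NameNode rejects
    // with "Superuser privilege is required" when the calling user (e.g. pxf)
    // is not an HDFS superuser.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class DatanodeReportCheck {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            if (fs instanceof DistributedFileSystem) {
                int nodes = ((DistributedFileSystem) fs).getDataNodeStats().length;
                System.out.println("datanodes: " + nodes);
            }
        }
    }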
