[ https://issues.apache.org/jira/browse/HDFS-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129852#comment-13129852 ]
Mariappan Asokan commented on HDFS-2461:
----------------------------------------
I have given this more thought. Since the Java globStatus() method already
queries the name node to retrieve the status information, I think we can change
the function signature to return that information for the sake of efficiency.
Also, to conform to the existing hdfsListDirectory(), I decided to return an
array of structures rather than an array of pointers. This allows the existing
C function hdfsFreeFileInfo() to be reused. I also added a path filter function
to the interface. Filtering will be done in the C implementation. The prototype
of the single new function is described below:
{code:title=hdfs.h}
/**
 * Path filter function prototype.
 * @param pathName path name passed to this function.
 * @return 0 if the path name is to be excluded; non-zero otherwise.
 */
typedef int (*PathFilter)(const char *pathName);

/**
 * hdfsGlobStatus - Get status for all HDFS file names that match a glob
 * pattern. The returned result will be an array of hdfsFileInfo structures.
 * The array is sorted by file name.
 * The function hdfsFreeFileInfo() should be called to free this array and its
 * contents.
 * @param fs The configured filesystem handle.
 * @param globPattern The glob pattern (as supported by the Hadoop
 * implementation) to match file names against.
 * @param filter A path filter function. If this is NULL, no filtering will be
 * done after glob expansion.
 * @param numEntries Pointer to an integer in which the number of entries in
 * the returned array will be stored. This will be set to -1 in case of error.
 * @return Returns a dynamically-allocated array of hdfsFileInfo structures; if
 * there is no match or an error, NULL will be returned. An error condition can
 * be identified by testing numEntries.
 */
hdfsFileInfo *hdfsGlobStatus(hdfsFS fs, const char *globPattern,
                             PathFilter filter, int *numEntries);
{code}
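To make the calling convention concrete, here is a minimal sketch of how a caller might use the proposed function. Since hdfsGlobStatus() does not exist yet, everything below is a stand-in: the one-field hdfsFileInfo, the in-memory MOCK_PATHS list, the '*'-only matcher mock_match(), and the helpers skip_tmp() and count_matches() are all hypothetical names made up for illustration. It also assumes that numEntries is set to 0 when nothing matches, which the prototype above does not state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real types come from libhdfs's hdfs.h. */
typedef void *hdfsFS;
typedef struct { char *mName; } hdfsFileInfo;  /* real struct has more fields */
typedef int (*PathFilter)(const char *pathName);

/* Mock namespace, already sorted by name (illustration only). */
static const char *const MOCK_PATHS[] = {
    "/logs/app.log", "/logs/app.log.tmp", "/logs/db.log"
};
#define MOCK_COUNT (sizeof MOCK_PATHS / sizeof MOCK_PATHS[0])

/* Tiny matcher that understands only '*'; Hadoop globs support much more. */
static int mock_match(const char *p, const char *s)
{
    if (*p == '\0') return *s == '\0';
    if (*p == '*')
        return mock_match(p + 1, s) || (*s != '\0' && mock_match(p, s + 1));
    return *p == *s && mock_match(p + 1, s + 1);
}

/* Mock with the same signature as libhdfs's hdfsFreeFileInfo(). */
void hdfsFreeFileInfo(hdfsFileInfo *infos, int numEntries)
{
    for (int i = 0; i < numEntries; i++) free(infos[i].mName);
    free(infos);
}

/* Mock of the proposed function: glob expansion first, then filtering. */
hdfsFileInfo *hdfsGlobStatus(hdfsFS fs, const char *globPattern,
                             PathFilter filter, int *numEntries)
{
    (void)fs;
    assert(numEntries != NULL);
    *numEntries = -1;                       /* pessimistic default: error */
    if (globPattern == NULL) return NULL;
    hdfsFileInfo *out = calloc(MOCK_COUNT, sizeof *out);
    if (out == NULL) return NULL;
    int n = 0;
    for (size_t i = 0; i < MOCK_COUNT; i++) {
        if (!mock_match(globPattern, MOCK_PATHS[i])) continue;
        if (filter != NULL && filter(MOCK_PATHS[i]) == 0) continue;
        out[n].mName = malloc(strlen(MOCK_PATHS[i]) + 1);
        if (out[n].mName == NULL) { hdfsFreeFileInfo(out, n); return NULL; }
        strcpy(out[n].mName, MOCK_PATHS[i]);
        n++;
    }
    *numEntries = n;                        /* assumption: 0 means no match */
    if (n == 0) { free(out); return NULL; }
    return out;
}

/* Example filter: returns 0 (exclude) for names ending in ".tmp". */
int skip_tmp(const char *pathName)
{
    size_t len = strlen(pathName);
    return !(len >= 4 && strcmp(pathName + len - 4, ".tmp") == 0);
}

/* Caller side: returns the number of matches, or -1 on error. */
int count_matches(hdfsFS fs, const char *pattern, PathFilter filter)
{
    int numEntries = 0;
    hdfsFileInfo *infos = hdfsGlobStatus(fs, pattern, filter, &numEntries);
    if (infos == NULL)
        return numEntries < 0 ? -1 : 0;     /* NULL alone is ambiguous */
    /* ... inspect infos[0 .. numEntries-1] here ... */
    hdfsFreeFileInfo(infos, numEntries);
    return numEntries;
}
```

The point of the sketch is the caller's NULL check: because NULL is returned both for "no match" and for an error, the caller has to test numEntries to tell the two apart, and a non-NULL result is released with the same hdfsFreeFileInfo() call used for hdfsListDirectory() results.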
If anyone has any comments, please let me know.
Thanks.
> Support HDFS file name globbing in libhdfs
> ------------------------------------------
>
> Key: HDFS-2461
> URL: https://issues.apache.org/jira/browse/HDFS-2461
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: libhdfs
> Reporter: Mariappan Asokan
> Priority: Minor
>
> This is to enhance the C API in libhdfs to support HDFS file name globbing.
> The proposal is to keep the new API simple and return a list of matched HDFS
> path names. Callers can use the existing hdfsGetPathInfo() to get additional
> information on each matched path. The following code snippet shows the
> proposed API enhancements:
> {code:title=hdfs.h}
> /**
> * hdfsGlob - Get all the HDFS file names that match a glob pattern. The
> * returned result will be sorted by the file names. The last element in the
> * array is NULL. The function hdfsFreeGlob() should be called to free this
> * array and its contents.
> * @param fs The configured filesystem handle.
> * @param globPattern The glob pattern to match file names against. Note that
> * this is not a POSIX regular expression but rather a POSIX glob pattern.
> * @return Returns a dynamically-allocated array of strings; if there is no
> * match, an array with one entry that has a NULL value will be returned. If
> * there is an error, NULL will be returned.
> */
> char ** hdfsGlob(hdfsFS fs, const char *globPattern);
> /**
> * hdfsFreeGlob - Free up the array returned by hdfsGlob().
> * @param globResult The array of dynamically-allocated strings returned by
> * hdfsGlob().
> */
> void hdfsFreeGlob(char **globResult);
> {code}
> Please comment on the above proposed API. I will start the implementation
> and testing. However, I need a committer to work with.
> Thanks.