Big Data - Data Lake Developer | Milpitas, CA
Contract: 1+yr (renewed quarterly)
Interview: Phone and Skype
Candidates are required to take a Java programming test with the prime vendor on Skype for 60-90 minutes.
The Data Lake engineer should come from the background of a senior Hadoop developer. The position will be in support of data management, ingestion, and client consumption. This individual has an absolute requirement to be well versed in Big Data fundamentals such as HDFS and YARN. More than a working knowledge of Hive is required, with an understanding of partitioning/reducer sizing/block sizing/etc. Preferably, the candidate has strong knowledge of Spark in either Python or Scala; basic Spark knowledge is required. Knowledge of other industry ETL tools (including NoSQL) such as Cassandra/Drill/Impala/etc. is a plus.
The candidate should be comfortable with Unix and standard enterprise environment tools/tech such as ftp/scp/ssh/Java/Python/SQL/etc. Our enterprise environment also contains a number of other utilities such as Tableau, Informatica, Pentaho, Mule, Talend, and others. Proficiency in these is a plus.
Must have: Java + Big Data (Hive, HBase).
Big Data - Data Lake Developer:
Very strong in core Java implementations. All the applications in Big Data are written in core Java.
Must be able to code algorithms and reduce their Big O complexity in Java (O(n), O(n log n), O(n^2), etc.), e.g. sorting, searching, etc.
The client will ask the candidate to implement code in core Java over a WebEx session (audio, video and screen sharing).
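For illustration only, a minimal core Java sketch of the kind of Big O reduction involved; the duplicate-check problem and the class and method names here are made up for the example and are not the actual test content:

    import java.util.HashSet;
    import java.util.Set;

    public class BigOExample {

        // O(n^2): compare every pair of elements to find a duplicate.
        static boolean hasDuplicateQuadratic(int[] a) {
            for (int i = 0; i < a.length; i++) {
                for (int j = i + 1; j < a.length; j++) {
                    if (a[i] == a[j]) {
                        return true;
                    }
                }
            }
            return false;
        }

        // O(n): one pass with a HashSet trades a little memory for time.
        static boolean hasDuplicateLinear(int[] a) {
            Set<Integer> seen = new HashSet<>();
            for (int x : a) {
                if (!seen.add(x)) { // add() returns false if x was already present
                    return true;
                }
            }
            return false;
        }

        public static void main(String[] args) {
            int[] data = {4, 1, 7, 3, 7};
            System.out.println(hasDuplicateQuadratic(data)); // true
            System.out.println(hasDuplicateLinear(data));    // true
        }
    }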
Sqoop is used heavily; 90% of all data imports are done using Sqoop. The candidate should know the different ways data can be imported, the parameters used, how to distribute jobs, how to optimize those parameters, etc.
Very good understanding of and implementation experience with Hive and HBase (NoSQL) is required.
Writing Bash scripts (shell scripting) and working in a Unix environment is mandatory: most of the common Unix commands, grepping logs, writing bash scripts and scheduling them, etc.
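As an illustration, a minimal sketch of a put/get round trip with the standard HBase Java client (org.apache.hadoop.hbase.client); the table name, column family and values are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBasePutGetExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml from the classpath
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("customer_events"))) { // hypothetical table

                // Write one cell: row key "cust-0001", column family "d", qualifier "status".
                Put put = new Put(Bytes.toBytes("cust-0001"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("ACTIVE"));
                table.put(put);

                // Read the same cell back.
                Result result = table.get(new Get(Bytes.toBytes("cust-0001")));
                String status = Bytes.toString(result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status")));
                System.out.println("status = " + status);
            }
        }
    }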
Excellent RDBMS SQL skills. The client has access to many data sources (Teradata, SQL Server, MySQL, Oracle, etc.), and the candidate must be able to easily connect to them and run complex queries.
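For illustration, a minimal JDBC sketch of connecting to one of those sources and running an aggregate query; the MySQL URL, credentials and table are placeholders (Teradata, SQL Server and Oracle differ mainly in driver and URL format):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class JdbcQueryExample {
        public static void main(String[] args) throws Exception {
            // Placeholder connection details; the matching JDBC driver must be on the classpath.
            String url = "jdbc:mysql://db-host:3306/sales";
            try (Connection conn = DriverManager.getConnection(url, "app_user", "app_password");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT region, SUM(amount) AS total FROM orders "
                         + "WHERE order_date >= ? GROUP BY region ORDER BY total DESC")) {
                ps.setString(1, "2018-01-01");
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("region") + " -> " + rs.getBigDecimal("total"));
                    }
                }
            }
        }
    }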
Python and Kafka are a plus
Java REST API implementation is a plus
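For the REST API plus, a minimal sketch using the JDK's built-in com.sun.net.httpserver package; the port and the /health path are arbitrary, and a framework such as JAX-RS or Spring could fill the same role:

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class RestEndpointExample {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

            // GET /health returns a small JSON payload.
            server.createContext("/health", exchange -> {
                byte[] body = "{\"status\":\"ok\"}".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });

            server.start();
            System.out.println("Listening on http://localhost:8080/health");
        }
    }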
Garima Gupta | Technical Recruiter | Apetan Consulting LLC
Tel: 201-620-9700 x 133 | [email protected] | [email protected]