Hi,

Please check and let me know.



Title: Big Data - Data Lake Developer

Location: Milpitas, CA

Contract: 1+yr

Interview: Phone and Skype



Candidates are required to take a Java programming test with the prime vendor on Skype for 60-90 minutes.



The Data Lake engineer should come from a senior Hadoop developer background. The position will be in support of data management, ingestion, and client consumption. This individual must be well versed in Big Data fundamentals such as HDFS and YARN. More than a working knowledge of Hive is required, including an understanding of partitioning, reducer sizing, block sizing, etc. Preferably, the candidate has strong knowledge of Spark with either Python or Scala; basic Spark knowledge is required. Knowledge of other industry ETL and NoSQL tools such as Cassandra, Drill, Impala, etc. is a plus.



The candidate should be comfortable with Unix and standard enterprise environment tools and technologies such as ftp, scp, ssh, Java, Python, SQL, etc.



Our enterprise environment also contains a number of other utilities such as Tableau, Informatica, Pentaho, Mule, Talend, and others; proficiency in these is a plus.



Must have: Java + Big Data (Hive, HBase).



Big Data - Data Lake Developer:

Very strong in core Java implementation; all of the Big Data applications are written in core Java.

Must be able to code algorithms in Java and reason about reducing their Big O complexity (O(n), O(n log n), O(n^2), etc.), e.g. sorting and searching. The client will ask the candidate to implement code in core Java over WebEx (audio, video, and screen sharing).
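
For illustration only (a hypothetical exercise of the kind such a screen might involve, not one supplied by the client), here is a plain core-Java sketch contrasting an O(n) linear search with an O(log n) binary search over a sorted array:

    import java.util.Arrays;

    public class SearchDemo {

        // O(n): scan every element until the target is found.
        static int linearSearch(int[] a, int target) {
            for (int i = 0; i < a.length; i++) {
                if (a[i] == target) {
                    return i;
                }
            }
            return -1;
        }

        // O(log n): halve the search interval each step; requires a sorted array.
        static int binarySearch(int[] a, int target) {
            int lo = 0, hi = a.length - 1;
            while (lo <= hi) {
                int mid = lo + (hi - lo) / 2;
                if (a[mid] == target) {
                    return mid;
                } else if (a[mid] < target) {
                    lo = mid + 1;
                } else {
                    hi = mid - 1;
                }
            }
            return -1;
        }

        public static void main(String[] args) {
            int[] data = {42, 7, 19, 3, 88, 56};
            Arrays.sort(data);                          // O(n log n) sort from the JDK
            System.out.println(linearSearch(data, 56)); // O(n)
            System.out.println(binarySearch(data, 56)); // O(log n)
        }
    }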

Sqoop is used heavily; 90% of all data imports are done with Sqoop. The candidate should know the different ways data can be imported, the parameters used, how to distribute jobs, how to optimize those parameters, etc.

Very good understanding and implementation experience of Hive and HBase (NoSQL).

Writing Bash scripts (shell scripting) and working in a Unix environment is mandatory: most of the common Unix commands, grepping logs, writing bash scripts and scheduling them, etc.

Excellent in RDBMS SQL. The client has access to many data sources (Teradata, SQL Server, MySQL, Oracle, etc.); the candidate must be able to easily connect and run complex queries.
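
As a minimal, hypothetical sketch of connecting to one such source from core Java via JDBC (the URL, credentials, table, and query below are placeholders, not the client's actual systems, and a MySQL JDBC driver is assumed to be on the classpath):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class QueryDemo {
        public static void main(String[] args) throws Exception {
            // Hypothetical connection details; substitute the real data source.
            String url = "jdbc:mysql://db.example.com:3306/sales";
            try (Connection conn = DriverManager.getConnection(url, "user", "password");
                 PreparedStatement stmt = conn.prepareStatement(
                         "SELECT region, SUM(amount) AS total " +
                         "FROM orders GROUP BY region HAVING SUM(amount) > ?")) {
                stmt.setLong(1, 100_000L);
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("region") + " -> " + rs.getLong("total"));
                    }
                }
            }
        }
    }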

Python and Kafka are a plus.

Java REST API implementation is a plus.
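
A minimal sketch of a REST-style endpoint using only the JDK's built-in com.sun.net.httpserver (the path and JSON payload are hypothetical; a real implementation would more likely use a framework such as JAX-RS or Spring Boot):

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class RestDemo {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            // GET /health returns a small JSON document.
            server.createContext("/health", exchange -> {
                byte[] body = "{\"status\":\"UP\"}".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
            System.out.println("Listening on http://localhost:8080/health");
        }
    }

Running the class and requesting http://localhost:8080/health should return the JSON body above.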





Best Regards,

Mohit Arora
[email protected]
201-620-9700 * 152
